Maximum Persistency via Iterative Relaxed
Inference in Graphical Models

Alexander Shekhovtsov, Paul Swoboda, and Bogdan Savchynskyy 


Abstract —We consider the NP-hard problem of MAP-inference for undirected discrete graphical models. We propose a polynomial 
time and practically efficient algorithm for finding a part of its optimal solution. Specifically, our algorithm marks some labels of the 
considered graphical model either as (i) optimal, meaning that they belong to all optimal solutions of the inference problem, or
(ii) non-optimal, if they provably do not belong to any solution. With access to an exact solver of a linear programming relaxation of the
MAP-inference problem, our algorithm marks the maximal possible (in a specified sense) number of labels. We also present a version
of the algorithm that has access to a suboptimal dual solver only and can still ensure the (non-)optimality of the marked labels,
although the overall number of marked labels may decrease. We propose an efficient implementation, which runs in time
comparable to a single run of a suboptimal dual solver. Our method scales well and shows state-of-the-art results on
computational benchmarks from machine learning and computer vision. 

Index Terms —Persistency, partial optimality, LP relaxation, discrete optimization, WCSP, graphical models, energy minimization 
1 Introduction

We consider the energy minimization or maximum a posteriori
(MAP) inference problem for discrete graphical models.
In the most common pairwise case it has the form

$$\min_{x \in \mathcal{X}} E_f(x) := f_0 + \sum_{v \in \mathcal{V}} f_v(x_v) + \sum_{uv \in \mathcal{E}} f_{uv}(x_u, x_v), \tag{1}$$

where minimization is performed over vectors $x$ containing the
discrete-valued components $x_v$. Further notation is detailed
in § 2. The problem has numerous applications in computer vision,
machine learning, communication theory, signal processing,
information retrieval and statistical physics; see [18, 47, 29] for
an overview of applications. Even in the binary case, when each
coordinate of $x$ can be assigned two values only, the problem is
known to be NP-hard and is also hard to approximate [27].

Hardness of the problem justifies a number of existing approximate
methods addressing it [18]. Among them, solvers addressing
its linear programming (LP) relaxations, and in particular the LP
dual [41, 49, 20], are among the most versatile and efficient
ones. However, apart from some notable exceptions (see the
overview of related work below), approximate methods can
guarantee neither optimality of their solutions as a whole nor even
optimality of any individual solution coordinates. That is, if $x$ is a
solution returned by an approximate method and $x^*$ is an optimal
one, there is no guarantee that $x^*_v = x_v$ for any coordinate $v$.

In contrast, our method provides such guarantees for some
coordinates. More precisely, for each component $x_v$ it eliminates
those of its values (henceforth called labels) which provably
cannot belong to any optimal solution. We call these eliminated
labels persistent non-optimal. Should a single label $a$ remain
non-eliminated, it implies that for all optimal solutions $x^*$ it
holds that $x^*_v = a$, and the label $a$ is called persistent optimal. Our
elimination method is polynomial and is applicable with any
(approximate) solver for a dual of a linear programming relaxation
of the problem, employed as a subroutine.

1.1 Related Work 

A trivial but essential observation is that any method identifying 
persistency has to be based on tractable sufficient conditions in 
order to avoid solving the NP-hard problem (1). 

Dead-end elimination methods (DEE) [10] verify local 
sufficient conditions by inspecting a given node and its immediate 
neighbors at a time. When a label in the node can be substituted 
with another one such that the energy for all configurations of the 
neighbors does not increase, this label can be eliminated without 
loss of optimality. 

A similar principle for eliminating interchangeable labels was
proposed in constraint programming [11]. Its generalization to
a related problem of Weighted Constraint Satisfaction (WCS) is
known as dominance rules or soft neighborhood substitutability.
However, because the WCS in general considers a bounded “+” 
operation, the condition appears to be intractable and therefore 
weaker sufficient local conditions were introduced, e.g., [26, 9]. 
The way [9] selects a local substitute label using equivalence 
preserving transforms is related to our method, in which we use 
an approximate solution based on the dual of the LP relaxation as 
a tentative substitute (or test) labeling. 

Although the local character of the DEE methods allows for an 
efficient implementation, it also significantly limits their quality, 
i.e., the number of found persistencies. As shown in [35, 43, 48], 
considering more global criteria may significantly increase the 
algorithm’s quality. 

Fig. 1: Progress of partial optimality methods. The top row corresponds to a stereo model with Potts interactions and large aggregating
windows for unary costs used in [24, 4] (instance published by [4]). The bottom row is a more refined stereo model with truncated linear
terms [45] (instance from [1]). The hashed red area indicates that the optimal persistent label in the pixel is not found (but some non-optimal
labels might have been eliminated). Solution completeness is given by the percentage of persistent labels. Graph cut based methods are fast
but only efficient for strong unary terms. LP-based methods are able to determine larger persistent assignments but were extremely slow prior to
this work.

The roof dual relaxation in quadratic pseudo-Boolean optimization
(QPBO) [5, 30] (equivalent to pairwise energy minimization
with binary variables) has the property that all variables that
are integer in the relaxed solution are persistent. Several
generalizations of roof duality to higher-order energies were proposed
(e.g., [2, 21]). The MQPBO method [19] and the generalized
roof duality [50] extend roof duality to the multi-label case by
reducing the problem to binary variables and generalizing the
concept of submodular relaxation [21], respectively. Although
for binary pairwise energies these methods provide a very good
trade-off between computational efficiency and the number of found
persistencies, their efficacy drops as the number of labels grows.

Auxiliary submodular problems were proposed in [24, 25]
as a sufficient persistency condition for multilabel energy
minimization. In the case of the Potts model, the method has a very
efficient specialized algorithm [14]. Although these methods have
shown very good efficacy for certain problem classes appearing in
computer vision, the number of persistencies they find drastically
decreases when the energy does not have strong unary terms (see
Fig. 1).

In contrast to the above methods that technically rely either 
on local conditions or on computing a maximum flow (min-cut), 
the works [42, 43, 44] and [35] proposed persistency approaches 
relying on a general linear programming relaxation. Authors 
of [42, 43, 44] demonstrated applicability of their approach to 
large-scale problems by utilizing existing efficient approximate 
MAP-inference algorithms, while in [35] the large-scale problems 
are addressed using a windowing technique. Despite the superior 
persistency results, the running time of the approximate-LP-based 
methods remained prohibitively slow for practical applications as 
illustrated by an example in Fig. 1. 

Not only can LP-based methods achieve superior results in
practice, they are even theoretically guaranteed to do so, as
proven for the method [35, 37]. In this method, the problem of
determining the maximum number of persistencies is formulated
as a polynomially solvable linear program. It is guaranteed to find
a provably larger persistency assignment than most of the
above-mentioned approaches. However, solving this linear program for
large-scale instances is numerically unstable or intractable, and
applying it to multiple local windows is prohibitively slow. This poses the
challenge of designing an LP-based method that is indeed
practical.

1.2 Contribution 

In this work we propose a method which solves the same maximum
persistency problem as in [35] and therefore delivers provably
better results than other methods. Similar to [44], our method
requires iteratively (approximately) solving the linear programming
relaxation of (1) as a subroutine. However, our method is
significantly faster than [35, 44] due to a substantial theoretical
and algorithmic elaboration of this subroutine.

We demonstrate the efficiency of our approach on benchmark
problems from machine learning and computer vision. We outperform
all competing methods in terms of the number of persistent
labels and method [35] in speed and scalability. On randomly
generated small problems, we show that the set of persistent labels
found using an approximate LP solver is close to the maximal one as
established by the (costly and not scalable) method [35].

The present paper is a revised version of [39]. Besides reworked
explanations and shortened and clarified proofs, one new technical
extension is a more general dual algorithm, with termination
guarantees for a larger class of approximate solvers.

2 Work Overview 

This section serves as an overview of our method, where we give 
the most general definitions, formulate the maximum persistency 
problem and briefly describe a generic method to solve it. This 
description, equipped with references to subsequent sections, 
should serve as a road map for the rest of the paper. 
Fig. 2: Dead end elimination / dominance. Variables are shown as
boxes and their possible labels as circles. Label $x_u = 1$ is substituted
with label $x_u = 3$. If for any configuration of neighbors $x_{\mathcal{N}(u)}$ the
energy does not increase (only the terms inside $\{u\} \cup \mathcal{N}(u)$ contribute
to the difference), label $x_u = 1$ can be eliminated without loss of
optimality.

Fig. 3: Simultaneous substitution of labels in two variables. $\mathcal{X}_u =
\mathcal{X}_v = \{1,2,3\}$; labels at arrow tails are substituted with labels at
arrow heads. So the joint configuration (1,2) (dashed) is substituted
with the configuration (3,3) (solid).

2.1 Notation 

In the MAP-inference problem (1) we assume $(\mathcal{V}, \mathcal{E})$ to be
a directed graph with the set of nodes $\mathcal{V}$ and the set of
edges $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$. Let $uv$ denote an ordered pair $(u, v)$ and
let $\mathcal{N}(u) = \{v \mid uv \in \mathcal{E} \lor vu \in \mathcal{E}\}$ stand for the set of neighbors
of $u$. Each node $v \in \mathcal{V}$ is associated with a variable $x_v$ taking
its values in a finite set of labels $\mathcal{X}_v$. Cost functions or potentials
$f_v: \mathcal{X}_v \to \mathbb{R}$, $f_{uv}: \mathcal{X}_u \times \mathcal{X}_v \to \mathbb{R}$ are associated with nodes
and edges respectively. Let $f_0 \in \mathbb{R}$ be a constant term, which
we introduce for the sake of notation. Finally, $\mathcal{X}$ stands for the
Cartesian product $\prod_{v \in \mathcal{V}} \mathcal{X}_v$ and its elements $x \in \mathcal{X}$ are called
labelings.

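We represent all potentials of energy (1) by a single cost vector
$f \in \mathbb{R}^{\mathcal{I}}$, where the set $\mathcal{I}$ enumerates all components of all terms:

$$\mathcal{I} = \{0\} \cup \{(u, i) \mid u \in \mathcal{V},\ i \in \mathcal{X}_u\} \cup \{(uv, ij) \mid uv \in \mathcal{E},\ i \in \mathcal{X}_u,\ j \in \mathcal{X}_v\}.$$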
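
To make the notation concrete, the following minimal sketch evaluates the energy (1) for a toy chain model and finds a minimizer by exhaustive enumeration. The dictionary layout (`unary`, `pairwise`), the function names and the numeric costs are illustrative assumptions and not part of the paper; enumeration is exponential in the number of nodes and is only meant for tiny instances.

```python
import itertools

# Hypothetical toy model: a chain u - v - w with 3 labels per node (0-based).
f0 = 0.0
unary = {
    "u": [0.0, 2.0, 1.0],
    "v": [1.0, 0.0, 2.0],
    "w": [2.0, 1.0, 0.0],
}
# Potts-like pairwise costs on the directed edges (u, v) and (v, w).
potts = [[0.0 if i == j else 1.0 for j in range(3)] for i in range(3)]
pairwise = {("u", "v"): potts, ("v", "w"): potts}


def energy(x, f0, unary, pairwise):
    """E_f(x) = f_0 + sum_v f_v(x_v) + sum_uv f_uv(x_u, x_v), as in (1)."""
    value = f0
    for v, costs in unary.items():
        value += costs[x[v]]
    for (u, v), costs in pairwise.items():
        value += costs[x[u]][x[v]]
    return value


def brute_force_map(f0, unary, pairwise):
    """Enumerate all labelings; feasible only for very small models."""
    nodes = list(unary)
    best_x, best_e = None, float("inf")
    for labels in itertools.product(*(range(len(unary[v])) for v in nodes)):
        x = dict(zip(nodes, labels))
        e = energy(x, f0, unary, pairwise)
        if e < best_e:
            best_x, best_e = x, e
    return best_x, best_e


best_x, best_e = brute_force_map(f0, unary, pairwise)
print(best_x, best_e)
```

Several labelings attain the minimum in this toy model, which is exactly the situation in which only part of an optimal solution can be certified as persistent.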

2.2 Improving Substitutions 

We formulate our persistency method in the framework of
(strictly) improving substitutions, called improving mappings in
our previous works [35, 37]. It was shown in [35] that most
existing persistency techniques can be expressed as improving
substitutions. A mapping $p: \mathcal{X} \to \mathcal{X}$ is called a substitution
if it is idempotent, i.e., $p(x) = p(p(x))$.

Definition 2.1. A substitution $p: \mathcal{X} \to \mathcal{X}$ is called strictly
improving for the cost vector $f$ if

$$(\forall x \mid p(x) \neq x) \quad E_f(p(x)) < E_f(x). \tag{2}$$

When a strictly improving substitution is applied to any
labeling $x$, it is guaranteed that $p(x)$ has equal or better energy.
In particular, strictly improving substitutions generalize the strong
autarky property [5]. When applied to the whole search space $\mathcal{X}$
we obtain its image $p(\mathcal{X})$, a potentially smaller search space
containing all optimal labelings.

In what follows we will restrict ourselves to node-wise substitutions,
i.e., those defined locally for each node: $p(x)_u = p_u(x_u)$,
where $p_u: \mathcal{X}_u \to \mathcal{X}_u$. Indeed, already this class of substitutions
covers most existing persistency methods.

Example 2.1. Let us consider the dead-end elimination
(DEE) [10, 13]. It is a test whether a given label in a single node,
e.g., $x_u = 1$ in Fig. 2, can be substituted with another one, e.g.,
$x_u = 3$ in Fig. 2. The change of the energy under this substitution
depends only on the configuration of neighbors $x_{\mathcal{N}(u)}$, and the
value of the change is additive in neighbors, so that it can be
verified for all $x_{\mathcal{N}(u)}$ whether the substitution always improves
the energy. If it is so, the label $x_u = 1$ can be eliminated and the
test is repeated for a different label in the reduced problem.
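
The additivity over neighbors mentioned in the example means that the worst case over all neighbor configurations decomposes into independent minimizations per neighbor. A minimal sketch of such a test, assuming a non-strict criterion ("the energy does not increase") and a hypothetical dictionary representation of the potentials, could look as follows; it is an illustration and not the exact test used in [10, 13].

```python
def dee_can_substitute(u, a, b, unary, pairwise):
    """True if replacing label a by label b in node u never increases the energy.

    The change E(x with x_u = a) - E(x with x_u = b) is a unary term plus one
    term per incident edge, so its minimum over all neighbor configurations is
    the sum of per-neighbor minima.
    """
    worst_gain = unary[u][a] - unary[u][b]
    for (s, t), costs in pairwise.items():
        if s == u:
            # u is the first endpoint; the neighbor label j indexes columns.
            worst_gain += min(costs[a][j] - costs[b][j]
                              for j in range(len(unary[t])))
        elif t == u:
            # u is the second endpoint; the neighbor label j indexes rows.
            worst_gain += min(costs[j][a] - costs[j][b]
                              for j in range(len(unary[s])))
    return worst_gain >= 0


# Hypothetical two-node model with Potts interaction of weight 1.
potts = [[0.0 if i == j else 1.0 for j in range(3)] for i in range(3)]
unary = {"u": [5.0, 0.5, 0.0], "v": [0.0, 0.0, 0.0]}
pairwise = {("u", "v"): potts}

print(dee_can_substitute("u", 0, 2, unary, pairwise))
print(dee_can_substitute("u", 1, 2, unary, pairwise))
```

In this toy model the unary advantage of 5 outweighs the worst-case pairwise loss of 1 for the first query, whereas an advantage of 0.5 does not for the second.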

A general substitution we consider is applied to labels in all 
nodes simultaneously, as illustrated in Fig. 3 for two variables. We 
obtain the following principle for identifying persistencies. 

Proposition 2.2. If $p$ is a strictly improving substitution, then any
optimal solution $x^*$ of (1) must satisfy $(\forall v \in \mathcal{V})\ p_v(x^*_v) = x^*_v$.

Indeed, otherwise $E_f(p(x^*)) < E_f(x^*)$, which is a contradiction.
If $p_v(i) \neq i$, then idempotency implies that label $(v, i)$ is
non-optimal persistent and can be excluded from consideration.
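
For very small models, Definition 2.1 and Proposition 2.2 can be checked directly by enumeration. The sketch below is such a brute-force check under assumed toy costs; the substitution is stored as one lookup list per node (0-based labels), which is a representation chosen here for illustration only.

```python
import itertools


def energy(x, unary, pairwise):
    """Pairwise energy (1) with f_0 = 0."""
    value = sum(costs[x[v]] for v, costs in unary.items())
    value += sum(costs[x[u]][x[v]] for (u, v), costs in pairwise.items())
    return value


def all_labelings(unary):
    nodes = list(unary)
    for labels in itertools.product(*(range(len(unary[v])) for v in nodes)):
        yield dict(zip(nodes, labels))


def is_strictly_improving(p, unary, pairwise):
    """Check (2): E_f(p(x)) < E_f(x) for every x with p(x) != x."""
    for x in all_labelings(unary):
        px = {v: p[v][x[v]] for v in x}
        if px != x and not energy(px, unary, pairwise) < energy(x, unary, pairwise):
            return False
    return True


# Hypothetical model; the substitution of Fig. 3 written with 0-based labels.
potts = [[0.0 if i == j else 1.0 for j in range(3)] for i in range(3)]
unary = {"u": [4.0, 4.0, 0.0], "v": [0.0, 4.0, 0.0]}
pairwise = {("u", "v"): potts}
p = {"u": [2, 2, 2], "v": [0, 2, 2]}

print(is_strictly_improving(p, unary, pairwise))

# Proposition 2.2: every optimal labeling is a fixed point of p.
best = min(energy(x, unary, pairwise) for x in all_labelings(unary))
optimal = [x for x in all_labelings(unary) if energy(x, unary, pairwise) == best]
print(all({v: p[v][x[v]] for v in x} == x for x in optimal))
```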

2.3 Verification Problem 

Verifying whether a given substitution is strictly improving is an
NP-hard decision problem [35]. In order to obtain a polynomial
sufficient condition we will first rewrite (2) as an energy
minimization problem and then relax it. To this end we reformulate
Definition 2.1 in an optimization form:

Proposition 2.3. Substitution $p$ is strictly improving iff

$$\min_{x \in \mathcal{X}} \big( E_f(x) - E_f(p(x)) \big) \geq 0, \tag{3a}$$

$$p(x) = x \ \text{for all minimizers}. \tag{3b}$$

Proof. Indeed, condition (3a) is equivalent to $(\forall x)\ E_f(x) \geq
E_f(p(x))$. Sufficiency: if $x \neq p(x)$, then by (3b) $x$ is not a minimizer
and $E_f(x) > E_f(p(x))$. Necessity: for $x = p(x)$ we have that
$E_f(x) = E_f(p(x))$; therefore from Definition 2.1 it follows that
condition (3a) holds and any $x = p(x)$ is a minimizer. Moreover,
for any minimizer $x$ it must be that $E_f(x) - E_f(p(x)) = 0$, and from
Definition 2.1 it follows that $x = p(x)$. □

In § 3 we will show that the difference of the energies in (3a)
can be represented as a pairwise energy with an appropriately
constructed cost vector $g$ so that there holds

$$E_f(x) - E_f(p(x)) = E_g(x). \tag{4}$$

Therefore, according to Proposition 2.3 the verification of the
strictly improving property reduces to minimizing the energy (4)
and checking that (3b) is fulfilled. To make the verification
problem tractable, we relax it as

$$\min_{\mu \in \Lambda} E_g(\mu) \geq 0, \tag{5a}$$

$$p(\mu) = \mu \ \text{for all minimizers}, \tag{5b}$$

where $\Lambda$ is a tractable polytope such that its integer vertices
correspond to labelings (the standard LP relaxation that we use
will be defined in § 5), $\mu$ is a relaxed labeling, and $E_g(\mu)$ and
$p(\mu)$ are appropriately defined extensions of the discrete functions
$E_g(x)$ and $p(x)$, defined in § 3. By construction, the objective
value (5) matches exactly that of (2) for all integer labelings
$\mu \in \Lambda \cap \{0, 1\}^{\mathcal{I}}$, which is sufficient for (2) to hold. The sufficient
condition (5) (made precise in Definition 3.2) means that $p$ strictly
improves not only integer labelings but also all relaxed labelings,
and therefore such substitutions will be called strictly
relaxed-improving for a cost vector $f$. Assuming $\Lambda$ is fixed by the context,
let $\mathbb{S}_f$ denote the set of all substitutions $p$ satisfying (5).


2.4 Maximum Persistency and Subset-to-One Substitutions

The maximum persistency approach [35] consists of finding a
relaxed-improving substitution $p \in \mathbb{S}_f$ that eliminates the maximal
number of labels:

$$\max_{p \in \mathcal{P}} \sum_{v \in \mathcal{V}} \big|\{i \in \mathcal{X}_v \mid p_v(i) \neq i\}\big| \quad \text{s.t. } p \in \mathbb{S}_f, \tag{6}$$

where $\mathcal{P}$ is a class of substitutions.

While maximizing over all substitutions is not tractable,
maximizing over the following restricted class is [35]. Assume we are
given a test labeling $y$, which in our case will be an approximate
solution of the MAP inference (1). We then consider substituting
in each node $v$ a subset of labels with $y_v$.


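Definition 2.4 ([35]). A substitution $p$ is in the class of
subset-to-one substitutions $\mathcal{P}^{2,y}$, where $y \in \mathcal{X}$, if there exist subsets
$\mathcal{Y}_v \subseteq \mathcal{X}_v \setminus \{y_v\}$ for all $v$ such that

$$p_v(i) = \begin{cases} y_v, & \text{if } i \in \mathcal{Y}_v; \\ i, & \text{if } i \notin \mathcal{Y}_v. \end{cases} \tag{7}$$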


See Figs. 3 and 4 for examples. Note that this class is rather large:
there are $2^{|\mathcal{X}_v| - 1}$ possible choices for $p_v$, and most of the existing
methods for partial optimality can still be represented using it [35]
(in particular, methods [25, 14] can be represented using a constant
test labeling, $y_v = a$ for all $v$).

The restriction to the class $\mathcal{P}^{2,y}$ allows us to represent the search
for the substitution that eliminates the maximum number of labels
as the search for the one with the largest (by inclusion) sets $\mathcal{Y}_v$ of substituted
labels. This leads to a relatively simple algorithm.
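
As an illustration of Definition 2.4, the following sketch builds a subset-to-one substitution from the sets $\mathcal{Y}_v$ and a test labeling $y$, and evaluates the objective of (6), i.e., the number of labels it eliminates. The helper names and the list-based representation with 0-based labels are assumptions made for this example.

```python
def subset_to_one(Y, y, n_labels):
    """Build p as in (7): p_v(i) = y_v if i is in Y_v, and i otherwise."""
    p = {}
    for v, n in n_labels.items():
        if y[v] in Y[v]:
            raise ValueError("the test label y_v must not belong to Y_v")
        p[v] = [y[v] if i in Y[v] else i for i in range(n)]
    return p


def num_eliminated(p):
    """Objective of (6): number of labels i with p_v(i) != i."""
    return sum(1 for v in p for i, target in enumerate(p[v]) if target != i)


def is_idempotent(p):
    return all(p[v][p[v][i]] == p[v][i] for v in p for i in range(len(p[v])))


# The substitution of Fig. 3 with 0-based labels: y = (2, 2).
n_labels = {"u": 3, "v": 3}
y = {"u": 2, "v": 2}
Y = {"u": {0, 1}, "v": {1}}
p = subset_to_one(Y, y, n_labels)
print(p, num_eliminated(p), is_idempotent(p))
```

Because $y_v \notin \mathcal{Y}_v$, every map built this way is idempotent, so it is a substitution in the sense of § 2.2.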


2.5 Cutting Plane Algorithm 

The algorithm is a cutting plane method in a general sense: we
maintain a substitution $p^t$ which in all iterations is better than or
equal to the solution of (6), and achieve feasibility by iteratively
constraining it.

Initialization: Define the substitution $p^0$ by the sets $\mathcal{Y}^0_v =
\mathcal{X}_v \setminus \{y_v\}$. It substitutes everything with $y$ and clearly
maximizes the objective (6).

Verification: Check whether the current $p^t$ is strictly
relaxed-improving for $f$ by solving the relaxed problem (5). If yes,
return $p^t$. If not, the optimal relaxed solution $\mu^*$ corresponds
to the most violated constraint.

Cutting plane: Assign $p^{t+1}$ to the substitution defined by the
largest sets $\mathcal{Y}^{t+1}_v$ such that $\mathcal{Y}^{t+1}_v \subseteq \mathcal{Y}^t_v$ and the constraints
$E_g(\mu^*) \geq 0$, $p(\mu^*) = \mu^*$ are satisfied. Repeat the verification
step.

The steps of this meta-algorithm are illustrated in Fig. 4. It is
clear that when the algorithm stops the substitution $p^t$ is strictly
improving, although it could be the identity map that does not
eliminate any labels. The exact specification of the cutting plane
step will be derived in § 4 and it will be shown that this algorithm
solves the maximum persistency problem (6) over $\mathcal{P}^{2,y}$ optimally.

Fig. 4: Steps of the discrete cutting-plane algorithm. (a) Starting from
the substitution that maps everything to the test labeling $y$ (red), crossed
labels would be eliminated if $p$ passes the sufficient condition. (b) A
relaxed solution $\mu^*$ violating the sufficient condition is found (black).
(c) Substitution $p$ is pruned.

2.6 Work Outline 

In § 3 we give a precise formulation of the relaxed condition (5)
and its components. In § 4 we specify details of the algorithm and
prove its optimality. These results hold for a general relaxation
$\Lambda \supseteq \mathcal{M}$ but require solving the linear programs (5a) precisely.

The rest of the paper is devoted to an approximate solution
of the problem (6), i.e., finding a relaxed-improving mapping
which is almost maximum. We consider specifically the standard
LP relaxation and reformulate the algorithm to use a dual solver
for the problem (5a), § 5. We then gradually relax requirements on
the optimality of the dual solver while keeping persistency
guarantees, §§ 5 to 7, and propose several theoretical and algorithmic
tools to solve the series of verification problems incrementally and
overall efficiently, § 8. Finally, we provide an exhaustive
experimental evaluation in § 9, which demonstrates the efficacy of
the developed method.


3 Relaxed-Improving Substitutions 
3.1 Overcomplete Representation 

In this section we formally derive the strictly relaxed-improving
sufficient condition (5). To obtain the relaxation we use the standard
lifting approach (a.k.a. overcomplete representation [47]), in
which a labeling is represented using the 1-hot encoding. This
lifting allows us to linearize the energy function, the substitution and
consequently both the non-relaxed (3) and relaxed (5) improving
substitution criteria.

The lifting is defined by the mapping $\delta: \mathcal{X} \to \mathbb{R}^{\mathcal{I}}$:

$$\delta(x)_0 = 1, \tag{8a}$$

$$\delta(x)_u(i) = [\![x_u = i]\!], \tag{8b}$$

$$\delta(x)_{uv}(i,j) = [\![x_u = i]\!]\,[\![x_v = j]\!], \tag{8c}$$

where $[\![\cdot]\!]$ is the Iverson bracket, i.e., $[\![A]\!]$ equals 1 if $A$ is
true and 0 otherwise. Using this lifting, we can linearize unary
terms as $f_v(x_v) = \sum_k f_v(k)[\![x_v = k]\!] = \sum_k f_v(k)\delta(x)_v(k)$ and
similarly for the pairwise terms. This allows us to linearize the energy
function $E_f$ and write it as a scalar product $E_f(x) = \langle f, \delta(x) \rangle$
in $\mathbb{R}^{\mathcal{I}}$. The energy minimization problem (1) can then be written
as

$$\min_{x \in \mathcal{X}} E_f(x) = \min_{x \in \mathcal{X}} \langle f, \delta(x) \rangle = \min_{\mu \in \mathcal{M}} \langle f, \mu \rangle, \tag{9}$$

where $\mathcal{M} = \operatorname{conv} \delta(\mathcal{X})$ is the convex hull of all labelings in the
lifted space, also known as the marginal polytope [47]. The last
equality in (9) uses the fact that the minimum of a linear function on a
finite set equals the minimum on its convex hull. Expression (9) is
an equivalent reformulation of the energy minimization problem
as a linear program, however over a generally intractable polytope
$\mathcal{M}$.
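
The lifting (8) and the identity $E_f(x) = \langle f, \delta(x) \rangle$ can be checked numerically on a small model. The sketch below stores $f$ and $\delta(x)$ as flat lists ordered as constant term, unary blocks, pairwise blocks; this ordering, the helper names and the toy costs are assumptions made for illustration.

```python
import itertools


def lift(x, n_labels, edges):
    """1-hot (overcomplete) encoding delta(x) of a labeling x, following (8)."""
    delta = [1.0]                                            # (8a)
    for v in sorted(n_labels):                               # (8b)
        delta += [1.0 if x[v] == i else 0.0 for i in range(n_labels[v])]
    for (u, v) in edges:                                     # (8c)
        delta += [1.0 if (x[u] == i and x[v] == j) else 0.0
                  for i in range(n_labels[u]) for j in range(n_labels[v])]
    return delta


def cost_vector(f0, unary, pairwise, edges):
    """Stack all potentials into one vector f with the same ordering as lift."""
    f = [f0]
    for v in sorted(unary):
        f += list(unary[v])
    for e in edges:
        f += [c for row in pairwise[e] for c in row]
    return f


def dot(a, b):
    return sum(s * t for s, t in zip(a, b))


def energy(x, f0, unary, pairwise):
    value = f0 + sum(costs[x[v]] for v, costs in unary.items())
    return value + sum(c[x[u]][x[v]] for (u, v), c in pairwise.items())


# Hypothetical two-node model with 3 labels per node.
f0 = 1.0
unary = {"u": [0.0, 2.0, 1.0], "v": [1.0, 0.0, 2.0]}
edges = [("u", "v")]
pairwise = {("u", "v"): [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]]}
n_labels = {v: len(c) for v, c in unary.items()}
f = cost_vector(f0, unary, pairwise, edges)

# The scalar product reproduces the energy for every labeling.
consistent = all(
    abs(dot(f, lift({"u": i, "v": j}, n_labels, edges))
        - energy({"u": i, "v": j}, f0, unary, pairwise)) < 1e-12
    for i, j in itertools.product(range(3), range(3))
)
print(consistent)
```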


3.2 Lifting of Substitutions 

Next we show how a substitution $p: \mathcal{X} \to \mathcal{X}$ can be represented
as a linear map in the lifted space $\mathbb{R}^{\mathcal{I}}$. This will allow us to express
the term $E_f(p(x))$ as a linear function of $\delta(x)$ and hence also to
represent the non-relaxed criterion (3).

Proposition 3.1. Given a substitution $p$, let $P^T: \mathbb{R}^{\mathcal{I}} \to \mathbb{R}^{\mathcal{I}}$ be
defined by its action on a cost vector $f \in \mathbb{R}^{\mathcal{I}}$ as follows:

$$(P^T f)_0 = f_0, \tag{10a}$$

$$(P^T f)_u(i) = f_u(p_u(i)), \tag{10b}$$

$$(P^T f)_{uv}(i,j) = f_{uv}(p_u(i), p_v(j)) \tag{10c}$$

$\forall u \in \mathcal{V}$, $uv \in \mathcal{E}$, $ij \in \mathcal{X}_{uv}$. Then $P$ satisfies

$$(\forall x \in \mathcal{X}) \quad \delta(p(x)) = P\delta(x). \tag{11}$$


Proof. Let $x \in \mathcal{X}$. From (10) it follows that $E_{P^T f}(x) =
E_f(p(x))$, which can be expressed as a scalar product

$$\langle P^T f, \delta(x) \rangle = \langle f, P\delta(x) \rangle = \langle f, \delta(p(x)) \rangle.$$

Since this equality holds for all $f \in \mathbb{R}^{\mathcal{I}}$, it follows that $P\delta(x) =
\delta(p(x))$. □

The expression (11) allows us to write the energy of the
substituted labeling $p(x)$ as

$$E_f(p(x)) = \langle f, \delta(p(x)) \rangle = \langle f, P\delta(x) \rangle = E_{P^T f}(x). \tag{12}$$

For this reason, the mapping $P$ is called the linear extension of $p$
and will be denoted with the symbol $[p]$. The following example
illustrates how $[p]$ looks in coordinates.


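Example 3.1. Consider the substitution $p$ depicted in Fig. 3 and
defined by $p_u: 1,2,3 \mapsto 3,3,3$; $p_v: 1,2,3 \mapsto 1,3,3$. The
relaxed labeling $\mu \in \mathbb{R}^{\mathcal{I}}$ has the structure $(\mu_0, \mu_u, \mu_v, \mu_{uv})$. The
linear extension $[p]: \mathbb{R}^{\mathcal{I}} \to \mathbb{R}^{\mathcal{I}}$ can be written as a block-diagonal
matrix

$$[p] = \begin{pmatrix} 1 & & & \\ & P_u & & \\ & & P_v & \\ & & & P_{uv} \end{pmatrix}, \quad \text{where } P_u = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix}, \quad P_v = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 1 \end{pmatrix}, \tag{13}$$

and $P_{uv}$ is defined by $P_{uv}\mu_{uv} = P_u \mu_{uv} P_v^T$, where $\mu_{uv}$ is shaped
as a $3 \times 3$ matrix. The action of the block $P_u$ is

$$P_u \begin{pmatrix} \mu_u(1) \\ \mu_u(2) \\ \mu_u(3) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \mu_u(1) + \mu_u(2) + \mu_u(3) \end{pmatrix}, \tag{14}$$

i.e., all relaxed labels are mapped to the indicator of the label
$x_u = 3$ (weighted by the total mass of $\mu_u$). The adjoint operator $P^T$ acts as follows (cf. (10b)):

$$P_u^T \big( f_u(1)\ f_u(2)\ f_u(3) \big)^T = \big( f_u(3)\ f_u(3)\ f_u(3) \big)^T, \tag{15}$$

$$P_v^T \big( f_v(1)\ f_v(2)\ f_v(3) \big)^T = \big( f_v(1)\ f_v(3)\ f_v(3) \big)^T. \tag{16}$$

Similarly, due to (10c) we have, e.g., $(P_{uv}^T f_{uv})(1,2) = f_{uv}(3,3)$,
$(P_{uv}^T f_{uv})(1,1) = f_{uv}(3,1)$, and so on. □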
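
The block structure of this example can be verified numerically. The sketch below builds the node blocks from the node-wise maps (with 0-based labels, so label 3 of the example becomes index 2) and checks the action on a relaxed labeling as well as the adjoint action (10b) and (10c). The helper functions and the numeric values are assumptions for illustration; the convention `P[k][i] = 1` iff $p(i) = k$ follows from requirement (11).

```python
def substitution_matrix(p_node):
    """Block P with P[k][i] = 1 if p(i) = k, so that P @ delta(i) = delta(p(i))."""
    n = len(p_node)
    return [[1.0 if p_node[i] == k else 0.0 for i in range(n)] for k in range(n)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matvec(a, x):
    return [sum(s * t for s, t in zip(row, x)) for row in a]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# Substitution of Fig. 3 with 0-based labels.
p_u, p_v = [2, 2, 2], [0, 2, 2]
P_u, P_v = substitution_matrix(p_u), substitution_matrix(p_v)

# Action on a relaxed unary block: all mass moves to the last label.
mu_u = [0.25, 0.25, 0.5]
print(matvec(P_u, mu_u))

# Adjoint action (10b): (P^T f)_u(i) = f_u(p_u(i)).
f_u, f_v = [7.0, 8.0, 9.0], [1.0, 2.0, 3.0]
adj_u = matvec(transpose(P_u), f_u)
adj_v = matvec(transpose(P_v), f_v)

# Adjoint action (10c) in matrix form: P_u^T F P_v, entry (i, j) = F[p_u(i)][p_v(j)].
F = [[10.0, 11.0, 12.0], [20.0, 21.0, 22.0], [30.0, 31.0, 32.0]]
adj_uv = matmul(matmul(transpose(P_u), F), P_v)
print(adj_u, adj_v, adj_uv[0][1], adj_uv[0][0])
```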


3.3 Strictly Improving Substitutions

Let $I$ denote the identity mapping $\mathbb{R}^{\mathcal{I}} \to \mathbb{R}^{\mathcal{I}}$. Using
Proposition 2.3 and the linear extension $[p]$, we obtain that substitution $p$
is strictly improving iff the value of

$$\min_{x \in \mathcal{X}} \langle f, \delta(x) - \delta(p(x)) \rangle = \min_{x \in \mathcal{X}} \langle f, (I - [p])\delta(x) \rangle = \min_{x \in \mathcal{X}} \langle (I - [p])^T f, \delta(x) \rangle = \min_{\mu \in \mathcal{M}} \langle (I - [p])^T f, \mu \rangle \tag{17}$$

is zero and $[p]\mu = \mu$ for all minimizers. Note that problem (17)
is of the same form as the energy minimization (9) with the cost
vector $g = (I - [p])^T f$, as introduced in (4).
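
In coordinates, the cost vector $g$ has components $g_u(i) = f_u(i) - f_u(p_u(i))$ and $g_{uv}(i,j) = f_{uv}(i,j) - f_{uv}(p_u(i), p_v(j))$, with a zero constant term, which follows from (10). A minimal sketch that constructs $g$ and checks identity (4) by enumeration is given below; the data layout, names and toy costs are assumptions made for illustration.

```python
import itertools


def energy(x, unary, pairwise):
    value = sum(costs[x[v]] for v, costs in unary.items())
    return value + sum(c[x[u]][x[v]] for (u, v), c in pairwise.items())


def verification_costs(p, unary, pairwise):
    """Components of g = (I - [p])^T f; the constant term f_0 - f_0 vanishes."""
    g_unary = {v: [c[i] - c[p[v][i]] for i in range(len(c))]
               for v, c in unary.items()}
    g_pairwise = {
        (u, v): [[c[i][j] - c[p[u][i]][p[v][j]] for j in range(len(c[i]))]
                 for i in range(len(c))]
        for (u, v), c in pairwise.items()
    }
    return g_unary, g_pairwise


# Hypothetical model and the substitution of Fig. 3 (0-based labels).
unary = {"u": [4.0, 4.0, 0.0], "v": [0.0, 4.0, 0.0]}
pairwise = {("u", "v"): [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]}
p = {"u": [2, 2, 2], "v": [0, 2, 2]}
g_unary, g_pairwise = verification_costs(p, unary, pairwise)

# Identity (4): E_g(x) = E_f(x) - E_f(p(x)) for every labeling x.
identity_holds = True
for i, j in itertools.product(range(3), range(3)):
    x = {"u": i, "v": j}
    px = {v: p[v][x[v]] for v in x}
    lhs = energy(x, g_unary, g_pairwise)
    rhs = energy(x, unary, pairwise) - energy(px, unary, pairwise)
    identity_holds = identity_holds and abs(lhs - rhs) < 1e-12
print(identity_holds)
```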

The sufficient condition for persistency (5) is obtained by
relaxing the intractable marginal polytope $\mathcal{M}$ in (17) to a tractable
outer approximation $\Lambda \supseteq \mathcal{M}$.

Definition 3.2 ([35]). Substitution $p$ is strictly $\Lambda$-improving for
the cost vector $f \in \mathbb{R}^{\mathcal{I}}$ (shortly, strictly relaxed-improving, or
$p \in \mathbb{S}_f$) if

$$\min_{\mu \in \Lambda} \langle (I - [p])^T f, \mu \rangle = 0, \tag{18a}$$

$$[p]\mu^* = \mu^* \ \text{for all minimizers}. \tag{18b}$$

In § 5, $\Lambda$ will be defined as the polytope of the standard LP
relaxation, but until then the arguments are general and require
only that $\Lambda \supseteq \mathcal{M}$. Since $\Lambda$ includes all integer labelings, (18)
is a sufficient condition for $p$ to be a strictly improving substitution and hence
for persistency.

Corollary 3.3. If substitution $p$ is strictly $\Lambda$-improving for $f$
and $\Lambda \supseteq \mathcal{M}$, then $p$ is strictly improving for $f$.

The problem (18a) will be called the verification LP and the
decision problem to test for $p \in \mathbb{S}_f$, i.e., to verify conditions (18),
will be called the verification problem.

4 Generic Persistence Algorithm 
4.1 Structure of $\mathcal{P}^{2,y}$

In [35] it was shown that the maximum persistency problem (6)
over the class of substitutions $\mathcal{P}^{2,y}$ can be formulated as a
single linear program, where the substitution is represented using
auxiliary (continuous) variables. Here we take a different approach
based on observing a lattice-like structure of improving
substitutions.

Throughout this section we will assume that the test labeling
$y \in \mathcal{X}$ is fixed. Let us compare two substitutions $p$ and $q$ by
the sets of the labels they eliminate. A substitution $p \in \mathcal{P}^{2,y}$
eliminates all labels in $\mathcal{Y}_v$, or equivalently all labels not in $p_v(\mathcal{X}_v)$.

Definition 4.1. A substitution $p \in \mathcal{P}^{2,y}$ is better than or equal to a
substitution $q \in \mathcal{P}^{2,y}$, denoted by $p \geq q$, if $(\forall v \in \mathcal{V})$ it holds that
$p_v(\mathcal{X}_v) \subseteq q_v(\mathcal{X}_v)$.

Proposition 4.2. Let the partially ordered set $(\mathbb{S}_f \cap \mathcal{P}^{2,y}, \geq)$
of subset-to-one strictly relaxed-improving substitutions have a
maximum and let it be denoted $r$. Then $r$ is the unique solution
of (6) with $\mathcal{P} = \mathcal{P}^{2,y}$.

Proof. Since $r$ is the maximum, it holds that $r \geq q$ for all $q$. From
Definition 4.1 we have $r_v(\mathcal{X}_v) \subseteq q_v(\mathcal{X}_v)$ for all $v$ and thus
$\sum_v |r_v(\mathcal{X}_v)| \leq \sum_v |q_v(\mathcal{X}_v)|$. Since the objective of (6) equals
$\sum_v (|\mathcal{X}_v| - |p_v(\mathcal{X}_v)|)$, $r$ is optimal to (6).
Additionally, if $r \neq q$, then $\sum_v |r_v(\mathcal{X}_v)| < \sum_v |q_v(\mathcal{X}_v)|$
and therefore $r$ is the unique solution to (6). □

The existence of the maximum will formally follow from the
correctness proof of the algorithm, Theorem 4.6. A stronger claim,
which is not necessary for our analysis but which may provide
better insight, is that $(\mathbb{S}_f \cap \mathcal{P}^{2,y}, \geq)$ is a lattice isomorphic to
the lattice of sets with union and intersection operations. This is
seen as follows. If both $p$ and $q$ are strictly improving then so is
their composition $r(x) = p(q(x))$, as can be verified by chaining
inequalities (2). In $\mathcal{P}^{2,y}$ the composition satisfies the property
$r_u(\mathcal{X}_u) = p_u(\mathcal{X}_u) \cap q_u(\mathcal{X}_u)$, and can be identified with the join
of $p$ and $q$ (the least $r$ such that $r \geq p$ and $r \geq q$). This can be
shown to hold also for $\mathbb{S}_f \cap \mathcal{P}^{2,y}$. It is this structure that allows us to
find the maximum in $\mathbb{S}_f \cap \mathcal{P}^{2,y}$ by a relatively simple algorithm.
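
The image-intersection property of compositions in $\mathcal{P}^{2,y}$ is easy to check on a small instance. The following sketch composes two subset-to-one substitutions that share the test labeling $y$; the helper names and the list-based representation with 0-based labels are assumptions for illustration.

```python
def subset_to_one(Y, y, n_labels):
    """p_v(i) = y_v if i is in Y_v, and i otherwise, as in (7)."""
    return {v: [y[v] if i in Y[v] else i for i in range(n)]
            for v, n in n_labels.items()}


def compose(p, q):
    """Node-wise composition r(x) = p(q(x))."""
    return {v: [p[v][q[v][i]] for i in range(len(q[v]))] for v in q}


def image(p):
    return {v: set(p[v]) for v in p}


n_labels = {"u": 3, "v": 3}
y = {"u": 2, "v": 2}
p = subset_to_one({"u": {0}, "v": {1}}, y, n_labels)
q = subset_to_one({"u": {1}, "v": set()}, y, n_labels)
r = compose(p, q)

# The image of the composition is the intersection of the images, and the
# set of substituted labels is the union Y^p_v | Y^q_v.
print(r, image(r))
```

In terms of the sets $\mathcal{Y}_v$, the composition corresponds to taking unions, which is the set-lattice structure referred to above.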

4.2 Generic Algorithm 

Our generic primal algorithm, displayed in Algorithm 1, represents
a substitution $p \in \mathcal{P}^{2,y}$ by the sets $\mathcal{Y}_v$ of labels to be substituted
with $y$, via (7). Line 1 initializes these sets to all labels but
$y$. Line 3 constructs the cost vector of the verification LP in
condition (18). Lines 4 and 6 solve the verification LP and test
whether the sufficient conditions (18) are satisfied via the following
reformulation.

Proposition 4.3. For a given substitution $p$, let $\mathcal{O}^* \subseteq \Lambda$ denote
the set of minimizers of the verification LP (18) and

$$\mathcal{O}^*_v := \{i \in \mathcal{X}_v \mid (\exists \mu \in \mathcal{O}^*)\ \mu_v(i) > 0\}, \tag{19}$$

which is the support set of all optimal solutions in node $v$. Then
$p \in \mathbb{S}_f$ iff $(\forall v \in \mathcal{V}\ \forall i \in \mathcal{O}^*_v)\ p_v(i) = i$. Proof in § A.

Corollary 4.4. For a substitution $p \in \mathcal{P}^{2,y}$ defined by (7) it holds that
$p \in \mathbb{S}_f$ iff $(\forall v \in \mathcal{V})\ \mathcal{O}^*_v \cap \mathcal{Y}_v = \emptyset$.

In the remainder of the paper we will relate the notation
$\mathcal{O}^*_v$ to $\mathcal{O}^*$ as in (19). The set of optimal solutions $\mathcal{O}^*$ can, in
general, be a $d$-dimensional face of $\Lambda$. We need to determine
in (19) whether an optimal solution $\mu \in \mathcal{O}^*$ exists such that
the coordinate $\mu_v(i)$ is strictly positive. From the theory of linear
programming [46], it is known that if one takes an optimal solution
$\mu$ in the relative interior of $\mathcal{O}^*$ (e.g., if $\mathcal{O}^*$ is a 2D face, the relative
interior excludes vertices and edges of $\mathcal{O}^*$) then its support set
$\{i \mid \mu_v(i) > 0\}$ is the same for all such points and it matches
$\mathcal{O}^*_v$ (19). Therefore, it is practically feasible to find the support sets
$\mathcal{O}^*_v$ by using a single solution found by an interior point/barrier
method (which are known to converge to a "central" point of the
optimal face) or by methods based on smoothing [31, 32]. Obtaining
an exact solution by these methods may become computationally
expensive as the size of the inference problem (1) grows. Despite
that, Algorithm 1 is implementable and defines the baseline for its
practically efficient variants solving (6) approximately. These are
developed further in the paper.

Algorithm 1: Iterative Pruning LP-Primal

Input: Cost vector $f \in \mathbb{R}^{\mathcal{I}}$, test labeling $y \in \mathcal{X}$;
Output: Maximum strictly improving substitution $p$;

1 $(\forall v \in \mathcal{V})\ \mathcal{Y}_v := \mathcal{X}_v \setminus \{y_v\}$;
2 while true
3     Construct the verification problem potentials
      $g := (I - [p])^T f$ with $p$ defined by (7);
4     $\mathcal{O}^* := \arg\min_{\mu \in \Lambda} \langle g, \mu \rangle$;
5     $(\forall v \in \mathcal{V})\ \mathcal{O}^*_v := \{i \in \mathcal{X}_v \mid (\exists \mu \in \mathcal{O}^*)\ \mu_v(i) > 0\}$;
6     if $(\forall v \in \mathcal{V})\ \mathcal{O}^*_v \cap \mathcal{Y}_v = \emptyset$ then return $p$;
7     for $v \in \mathcal{V}$ do
8         Pruning of substitutions: $\mathcal{Y}_v := \mathcal{Y}_v \setminus \mathcal{O}^*_v$;

Since Line 6 of Algorithm 1 verifies precisely the condition of
Corollary 4.4, the algorithm terminates as soon as $p \in \mathbb{S}_f$ and
hence $p$ is strictly improving. In the opposite case, Line 8 prunes
the sets $\mathcal{Y}_v$ by removing labels corresponding to the support set
$\mathcal{O}^*_v$ of all optimal solutions of the verification LP, which have
now been identified to violate the sufficient condition. These labels
may be a part of some optimal solution to (1) and will not be
eliminated.
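
The control flow of Algorithm 1 can be sketched for toy problems by choosing the tightest admissible relaxation, $\Lambda = \mathcal{M}$, and replacing the LP solver by exhaustive enumeration. This swap is an assumption of the sketch and not the setting studied in the paper: for $\Lambda = \mathcal{M}$ the optimal face is the convex hull of the optimal labelings, so the support sets (19) are the unions of labels used by these labelings. The running time is exponential, and all names and costs below are hypothetical.

```python
import itertools


def energy(x, unary, pairwise):
    value = sum(costs[x[v]] for v, costs in unary.items())
    return value + sum(c[x[u]][x[v]] for (u, v), c in pairwise.items())


def iterative_pruning_bruteforce(unary, pairwise, y, tol=1e-9):
    """Algorithm 1 with Lambda = M, the LP being solved by enumeration."""
    nodes = list(unary)
    labelings = [dict(zip(nodes, ls)) for ls in
                 itertools.product(*(range(len(unary[v])) for v in nodes))]
    # Line 1: substitute every label except y_v.
    Y = {v: set(range(len(unary[v]))) - {y[v]} for v in nodes}
    while True:
        p = {v: [y[v] if i in Y[v] else i for i in range(len(unary[v]))]
             for v in nodes}
        # Line 3: E_g(x) = E_f(x) - E_f(p(x)), evaluated for every labeling.
        values = [energy(x, unary, pairwise)
                  - energy({v: p[v][x[v]] for v in nodes}, unary, pairwise)
                  for x in labelings]
        best = min(values)                                   # Line 4
        # Line 5: support of the optimal face, node by node.
        support = {v: set() for v in nodes}
        for x, value in zip(labelings, values):
            if value <= best + tol:
                for v in nodes:
                    support[v].add(x[v])
        # Line 6: termination test of Corollary 4.4.
        if all(not (support[v] & Y[v]) for v in nodes):
            return p, Y
        # Lines 7-8: pruning.
        for v in nodes:
            Y[v] -= support[v]


potts = [[0.0 if i == j else 1.0 for j in range(3)] for i in range(3)]
pairwise = {("u", "v"): potts}
y = {"u": 2, "v": 2}

# Unique optimum (2, 2): everything is substituted with y.
p1, Y1 = iterative_pruning_bruteforce(
    {"u": [4.0, 4.0, 0.0], "v": [0.0, 4.0, 0.0]}, pairwise, y)
# Two optima, (0, 0) and (2, 2): only the middle label can be eliminated.
p2, Y2 = iterative_pruning_bruteforce(
    {"u": [0.0, 4.0, 0.0], "v": [0.0, 4.0, 0.0]}, pairwise, y)
print(p1, p2)
```

In the second toy model one pruning step is needed: label 0 belongs to an optimal labeling, so it is removed from $\mathcal{Y}_u$ and $\mathcal{Y}_v$ before the test in Line 6 succeeds.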

To complete the analysis of Algorithm 1 it remains to answer 
two questions: i) does it terminate and ii) is it optimal for the 
maximum persistency problem (6)? 

Proposition 4.5. Algorithm 1 runs in polynomial time and returns
a substitution $p \in \mathbb{S}_f \cap \mathcal{P}^{2,y}$.

Proof. As we discussed above, the sets $\mathcal{O}^*_v$ in Line 5 can be found
in polynomial time. At every iteration, if the algorithm has
not terminated yet, at least one of the sets $\mathcal{Y}_v$ strictly shrinks,
as can be seen by comparing the termination condition in Line 6
with the pruning in Line 8. Therefore the algorithm terminates in
at most $\sum_{v \in \mathcal{V}} (|\mathcal{X}_v| - 1)$ iterations. On termination, $p \in \mathbb{S}_f$ by
Corollary 4.4. □

Theorem 4.6. Substitution $p$ returned by Algorithm 1 is the
maximum of $\mathbb{S}_f \cap \mathcal{P}^{2,y}$ and thus it solves (6). Proof in ??.

It is noteworthy that Algorithm 1 can be used to solve problem (6)
with any polytope $\Lambda$ satisfying $\mathcal{M} \subseteq \Lambda$, i.e., with
any LP relaxation of (1) that can be expressed in the lifted
space $\mathbb{R}^{\mathcal{I}}$. Moreover, in order to use the algorithm with
higher-order models one needs merely to (straightforwardly) generalize
the linear extension (10), as done in [37].

The test labeling $y$ can itself be chosen using the approximate
solution of the LP relaxation, e.g., via the zeroth iteration of the
algorithm with $g = f$ and picking $y_v$ from $\mathcal{O}^*_v$. This choice is
motivated by the fact that a strictly relaxed-improving substitution
cannot eliminate the labels from the support set of optimal
solutions of the LP relaxation [35], and thus these labels may not be
substituted with anything else.

4.3 Comparison to Previous Work 

Substitutions in $\mathcal{P}^{2,y}$ are related to the expansion move
algorithm [6] in the following sense. While [6] seeks to improve a
single current labeling $x$ by calculating an optimized crossover
(fusion) with a candidate labeling $y$, we seek which labels can
be moved with a guaranteed improvement to $y$ for all possible
labelings $x$.

Algorithm 1 is similar in structure to [43]. The latter finds
an improving substitution in a small class $\mathcal{P}^{1,y}$ by incrementally
shrinking the set of potentially persistent variables. More
specifically, given a test labeling $y \in \mathcal{X}$, the all-to-one class of
substitutions $\mathcal{P}^{1,y}$ contains substitutions $p$ which in every node
$v$ either replace all labels with $y_v$ or leave all labels unchanged.
There are only two possible choices for $p_v$: either $i \mapsto y_v$ for
all $i \in \mathcal{X}_v$ or the identity $i \mapsto i$. Methods [24, 42, 43] can be
explained as finding an improving mapping in this class [35]. We
generalize the method [43] to substitutions in $\mathcal{P}^{2,y}$. The original
sufficient condition for persistency in [43] does not extend to
such substitutions. Even for substitutions in $\mathcal{P}^{1,y}$ it is generally
weaker than condition (18) unless a special reparametrization is
applied [44]. Criterion (18) extends to general substitutions and
does not depend on reparametrization. Similarly to [43], we will
use approximate dual solvers in this more general setting.
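To make the two classes concrete, the following sketch builds a subset-to-one substitution from sets $\mathcal{Y}_v$ and tests whether it also belongs to the all-to-one class. The helper names and the dictionary layout are illustrative assumptions, not notation from the paper.

```python
def make_substitution(Y, y):
    """Subset-to-one map: p_v(i) = y_v if i is in Y_v, otherwise i."""
    def p(v, i):
        return y[v] if i in Y[v] else i
    return p


def apply_to_labeling(p, x):
    """Apply the node-wise map to a whole labeling x = {v: x_v}."""
    return {v: p(v, i) for v, i in x.items()}


def is_all_to_one(Y, labels, y):
    """All-to-one class: in each node either every label other than y_v is
    moved to y_v, or no label is moved at all (Y_v is empty)."""
    return all(Y[v] == labels[v] - {y[v]} or not Y[v] for v in labels)


labels = {"u": {0, 1, 2}, "v": {0, 1, 2}}
y = {"u": 0, "v": 1}
Y = {"u": {1, 2}, "v": {2}}      # node v keeps label 0 in place
p = make_substitution(Y, y)
x = {"u": 2, "v": 0}
px = apply_to_labeling(p, x)     # label 2 of u moves to y_u, label 0 of v stays
```

In this example the map is subset-to-one but not all-to-one, because node `v` moves only one of its two non-test labels.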

In [35, (ε-L1)] problem (6) is formulated as one big linear program.
We solve problem (6), and hence also the LP problem of [35],
in a more combinatorial fashion w.r.t. the variables defining the
substitution.

It may seem that solving a series of linear programs rather 
than a single one is a disadvantage of the proposed approach. 
However, as we show further, the proposed iterative algorithm 
can be implemented using a dual, possibly suboptimal, solver for 
the relaxed verification problem (18). This turns out to be much
more beneficial in practice since the verification problems can be
incrementally updated from iteration to iteration and solved overall 
very efficiently. This approach achieves scalability by exploiting 
available specialized approximate solvers for the relaxed MAP 
inference. Essentially, any dual (approximate) solver can be used 
as a black box in our method. 

5 Persistency with Dual Solvers 

Though Algorithm 1 is quite general, its practical use is limited 
by the strict requirements on the solver, which must be able to 
determine the exact support set of all optimal solutions. However, 
finding even a single solution of the relaxed problem with standard 
methods like simplex or interior point can be practically infeasible 
and one has to switch to specialized solvers developed for this 
problem. Although there are scalable algorithms based on smooth¬ 
ing techniques [31, 32], which converge to an optimal solution, 
waiting until convergence in each iteration of Algorithm 1 can 
make the whole procedure impractical. In general, we would like 
to avoid restricting ourselves to certain selected solvers to be 
able to choose the most efficient one for a given problem. In the 
standard LP relaxation (introduced below in (20)), the number of 
primal variables grows quadratically with the number of labels, 
while the number of dual variables grows only linearly. It is 
therefore desirable to use solvers working in the dual domain, 
including suboptimal ones (e.g., [20, 32, 12, 33], performing
block-coordinate descent), as they offer the most performance for
a limited time budget. Furthermore, fast parallel versions of such 
methods have been developed to run on GPU/FPGA [38, 7, 16], 
making the LP approach feasible for more vision applications. 

We will switch to the dual verification LP and gradually relax 
our requirements on the solution returned by a dual solver. This is 
done in the following steps: 

1. an optimal dual solution; 

2. an arc consistent dual point; 

3. any dual point. 

Our main objective is to ensure in each of these cases that the
found substitution $p$ is strictly improving, while possibly compromising
its maximality. The final practical algorithm operating in
mode 3 relies on the persistency problem reduction introduced
in § 6. Intermediate steps 1 and 2 are considered right after
defining the standard LP relaxation and its dual.

5.1 LP Relaxation 

We consider the standard local polytope relaxation [41, 49, 47] 
of the energy minimization problem (1) given by the following 
primal-dual pair: 

$$
\begin{array}{ll}
\text{(primal)} & \text{(dual)} \\[2pt]
\min\ \langle f, \mu \rangle & \max\ f^{\varphi}_0 \\[2pt]
\sum_{j} \mu_{uv}(i,j) = \mu_u(i), & \varphi_{uv}(i) \in \mathbb{R}, \\
\sum_{i} \mu_{uv}(i,j) = \mu_v(j), & \varphi_{vu}(j) \in \mathbb{R}, \\
\sum_{i} \mu_u(i) = \mu_0, & \varphi_u \in \mathbb{R}, \\
\mu_u(i) \ge 0, & f^{\varphi}_u(i) \ge 0, \\
\mu_{uv}(i,j) \ge 0, & f^{\varphi}_{uv}(i,j) \ge 0, \\
\mu_0 = 1 &
\end{array}
\qquad (20)
$$

where $f^{\varphi}$ abbreviates

$$ f^{\varphi}_u(i) = f_u(i) + \sum_{v \in \mathcal{N}(u)} \varphi_{uv}(i) - \varphi_u, \qquad (21a) $$

$$ f^{\varphi}_{uv}(i,j) = f_{uv}(i,j) - \varphi_{uv}(i) - \varphi_{vu}(j), \qquad (21b) $$

$$ f^{\varphi}_0 = f_0 + \sum_{u} \varphi_u. \qquad (21c) $$

The constraints of the primal problem (20) define the local
polytope $\Lambda$. The cost vector $f^{\varphi}$ is called a reparametrization of
$f$. There holds cost equivalence: $\langle f^{\varphi}, \mu \rangle = \langle f, \mu \rangle$
for all $\mu \in \Lambda$ (as well as $E_f = E_{f^{\varphi}}$), see [49]. Using the
reparametrization, the dual problem (20) can be briefly expressed as

$$ \max_{\varphi}\ f^{\varphi}_0 \quad \text{s.t.} \quad (\forall \omega \in \mathcal{V} \cup \mathcal{E})\ f^{\varphi}_{\omega} \ge 0. \qquad (22) $$

Note that for a feasible $\varphi$ the value $f^{\varphi}_0$ is a lower bound on the
primal problem (20). In what follows we will assume that $\varphi$ in (22)
additionally satisfies the following normalization: $\min_i f^{\varphi}_u(i) = 0$
and $\min_{ij} f^{\varphi}_{uv}(i,j) = 0$ for all $u$, $v$, which is automatically
satisfied for any optimal solution.

5.2 Expressing $\mathcal{O}^*_v$ in the Dual Domain

Let $(\mu, \varphi)$ be a pair of primal and dual optimal solutions to (20).
From complementary slackness we know that if $\mu_v(i) > 0$ then
the respective dual constraint holds with equality:

$$ \mu_v(i) > 0 \ \Rightarrow\ f^{\varphi}_v(i) = 0, \qquad (23) $$

in this case we say that $f^{\varphi}_v(i)$ is active. The set of such active
dual constraints matches the sets of local minimizers of the
reparametrized problem,

$$ \mathcal{O}_v(\varphi) := \{ i \in \mathcal{X}_v \mid f^{\varphi}_v(i) = 0 \} = \operatorname{argmin}_i f^{\varphi}_v(i). \qquad (24) $$

From complementary slackness (23) we obtain that

$$ \mathcal{O}^*_v \subseteq \mathcal{O}_v(\varphi). \qquad (25) $$

This inclusion is insufficient for an exact reformulation of
Algorithm 1, however it is sufficient for correctness if we make sure
that $\mathcal{Y}_v \cap \mathcal{O}_v(\varphi) = \emptyset$ on termination, i.e., that the
substitution $p$ does not displace labels in $\mathcal{O}_v(\varphi)$. Then, by
Corollary 4.4, $p \in \mathbb{S}_f$ follows.

There always exist an optimal primal solution $\mu$ and an optimal dual
solution $\varphi$ satisfying strict complementarity [46], in which case
relation (23) becomes an equivalence:

$$ \mu_v(i) > 0 \ \Leftrightarrow\ f^{\varphi}_v(i) = 0. \qquad (26) $$

This is the case when both $\mu$ and $\varphi$ are relative interior points of
the optimal primal, resp. optimal dual, faces [46]. For a relative
interior optimal $\varphi$, the set of constraints that are satisfied as
equalities $f^{\varphi}_v(i) = 0$ is the smallest and does not depend on
the specific choice of such $\varphi$. Under strict complementarity, (25)
turns into the equality $\mathcal{O}^*_v = \mathcal{O}_v(\varphi)$, which allows computing
the exact maximum persistency using a dual algorithm without
reconstructing a primal solution. However, finding such $\varphi$ appears
more difficult: e.g., the most efficient dual block-coordinate ascent
solvers [20, 12, 8, 33] only have convergence guarantees
(see [20, 33]) allowing to find a sub-optimal solution satisfying
arc consistency.

Algorithm 2: Iterative Pruning Arc Consistency

Input: Cost vector $f \in \mathbb{R}^{\mathcal{I}}$, test labeling $y \in \mathcal{X}$;
Output: Strictly improving substitution $p$;

1 $(\forall v \in \mathcal{V})\ \mathcal{Y}_v := \mathcal{X}_v \setminus \{y_v\}$;
2 while true
3     Construct verification problem $g := (I - [p])^{\mathsf{T}} f$ with $p$ defined by (7);
4     Use dual solver to find $\varphi$ such that $g^{\varphi}$ is arc consistent;
5     $\mathcal{O}_v(\varphi) := \{ i \in \mathcal{X}_v \mid g^{\varphi}_v(i) = 0 \}$;
6     if $(\forall v \in \mathcal{V})\ \mathcal{O}_v(\varphi) \cap \mathcal{Y}_v = \emptyset$ then return $p$;
7     for $v \in \mathcal{V}$ do
8         Pruning of substitutions: $\mathcal{Y}_v := \mathcal{Y}_v \setminus \mathcal{O}_v(\varphi)$;

Definition 5.1 ([49]). A reparametrized problem $f^{\varphi}$ is called arc
consistent if: (i) for all $uv \in \mathcal{E}$, from $f^{\varphi}_{uv}(i,j)$ being active it
follows that $f^{\varphi}_u(i)$ and $f^{\varphi}_v(j)$ are active; (ii) for all
$u \in \mathcal{V}$, from $f^{\varphi}_u(i)$ being active it follows that for all
$v \in \mathcal{N}(u)$ there exists a $j \in \mathcal{X}_v$ such that $f^{\varphi}_{uv}(i,j)$ is active.
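Definition 5.1 can be checked mechanically. The sketch below assumes the reparametrized costs are already given as nested dictionaries (`unary[u][i]` and `pairwise[(u, v)][(i, j)]`) and treats a cost as active when it is zero up to a tolerance `eps`; both the data layout and the tolerance are assumptions made for illustration.

```python
def is_arc_consistent(unary, pairwise, eps=1e-9):
    """Check conditions (i) and (ii) of arc consistency.

    unary[u][i]              -- reparametrized unary cost of label i in node u
    pairwise[(u, v)][(i, j)] -- reparametrized pairwise cost on edge uv
    A cost is 'active' if it is numerically zero.
    """
    def active(value):
        return abs(value) <= eps

    for (u, v), cost in pairwise.items():
        # (i) an active pair implies that both of its labels are active
        for (i, j), c in cost.items():
            if active(c) and not (active(unary[u][i]) and active(unary[v][j])):
                return False
        # (ii) every active label has an active pair on this incident edge
        for i, c in unary[u].items():
            if active(c) and not any(active(cost[(i, j)]) for j in unary[v]):
                return False
        for j, c in unary[v].items():
            if active(c) and not any(active(cost[(i, j)]) for i in unary[u]):
                return False
    return True


unary = {"u": {0: 0.0, 1: 2.0}, "v": {0: 0.0, 1: 0.0}}
consistent = {("u", "v"): {(0, 0): 0.0, (0, 1): 0.0, (1, 0): 1.0, (1, 1): 3.0}}
# Label 1 of node v is active but has no active pair: condition (ii) fails.
broken = {("u", "v"): {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}}
```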

An optimal dual solution need not be arc consistent, but 
it can be reparametrized without loss of optimality to enforce 
arc consistency [49]. Labels that become inactive during this 
procedure are not in the support set of primal solutions. In general 
the following holds. 

Proposition 5.2. Arc consistency is a necessary condition for
relative interior optimality: if $\mathcal{O}_v(\varphi) = \mathcal{O}^*_v$ for all
$v \in \mathcal{V}$ then $f^{\varphi}$ is arc consistent. Proof in (38).

This property is in our favor, since we are ideally interested
in the equality $\mathcal{O}_v(\varphi) = \mathcal{O}^*_v$. Next, we propose an algorithm
utilizing an arc consistent solver and prove that it is guaranteed to
output $p \in \mathbb{S}_f$.

5.3 Persistency with an Arc Consistency Solver 

We propose Algorithm 2 which is based on a dual solver attaining 
the arc consistency condition (differences to Algorithm 1 under¬ 
lined). If the dual solver (in line 4) finds a relative interior optimal 
solution, Algorithm 2 solves (6) exactly. Otherwise it is suboptimal 
and we need to reestablish its correctness and termination. 

Lemma 5.3 (Termination of Algorithm 2). Algorithm 2 terminates
in at most $\sum_v (|\mathcal{X}_v| - 1)$ iterations.

Proof. In case the return condition in line 6 is not satisfied,
$\mathcal{O}_v \cap \mathcal{Y}_v \neq \emptyset$ for some $v$ and the pruning in line 8
excludes at least one label from $\mathcal{Y}_v$. □


[Figure 5: panels (a) and (b), each showing the label groups
$\mathcal{X}_u \setminus \mathcal{Y}_u$ and $\mathcal{Y}_u$.]

Fig. 5: Illustration for the reduction. Labels $\mathcal{X}_u \setminus \mathcal{Y}_u$ are
not displaced by $p$, hence their associated unary and pairwise costs are
zero in $g = (I - [p])^{\mathsf{T}} f$. In case (a) the indicated pairwise costs
are replaced with their minimum. In case (b) the value of $g_{uv}(i,j)$ can
be decreased, assuming all reductions of type (a) and their symmetric
counterparts are already performed. The amount of decrease matches the
value of the mixed derivative (non-submodularity) associated to $i, i'$
paired with $j, j'$.

Lemma 5.4 (Correctness of Algorithm 2). If
$(\forall v \in \mathcal{V})\ \mathcal{O}_v(\varphi) \cap \mathcal{Y}_v = \emptyset$ holds for an arc
consistent dual vector $\varphi$, then $\varphi$ is optimal. Proof in (57).

It follows that when Algorithm 2 terminates, the found arc
consistent solution $\varphi$ is optimal, in which case inclusion (25) is
satisfied and the found substitution $p$ is guaranteed to be in $\mathbb{S}_f$.

5.4 Solvers Converging to Arc Consistency 

One can see that arc consistency is only required on termination of 
Algorithm 2. In the intermediate iterations we may as well perform 
the pruning step, line 8, without waiting for the solver to converge. 
This motivates the following practical strategy: 

• Perform a number of iterations towards finding an arc-
consistent dual point $\varphi$;

• Check whether there are some labels to prune, i.e.,
$(\exists u)\ \mathcal{O}_u(\varphi) \cap \mathcal{Y}_u \neq \emptyset$;

• Terminate if $\varphi$ is arc consistent and there is nothing to
prune; otherwise, perform more iterations towards arc consistency.

If the solver is guaranteed to eventually find an arc consistent
solution, the overall algorithm will either terminate with an arc
consistent and (by Lemma 5.4) optimal $\varphi$ or there will be some
labels to prune. However, we have to face the question what
happens if the dual solver does not find an arc consistent solution 
in finite time. In this case the algorithm can be iterating infinitely 
with no pruning available. At the same time there is no guarantee 
that a pruning step will not occur at some point and thus if we 
simply terminate the algorithm we get no persistency guarantees. 
Even if the dual solver was guaranteed to converge in a finite 
number of iterations, it is in principle possible that the time needed 
for a pruning to succeed would be proportional to the time of 
convergence, making the whole algorithm very slow. Instead, it is 
desirable to guarantee a valid result while allowing only a fixed 
time budget for the dual solver. We will overcome this difficulty 
with the help of the reduced verification LP presented next. 

6 Verification Problem Reduction 

Algorithms 1 and 2 iteratively solve verification problems. We can
replace the verification LP solved in step 4 by a simpler, reduced
one, without loss of optimality of the algorithms.

Definition 6.1. Let $g := (I - [p])^{\mathsf{T}} f$ be the cost vector of the
verification LP. The reduced cost vector $\bar g$ is defined as

$$ \bar g_v(i) := g_v(i),\ v \in \mathcal{V}; \qquad \bar g_0 := 0; \qquad (27a) $$

$$
\bar g_{uv}(i,j) :=
\begin{cases}
0, & i \notin \mathcal{Y}_u,\ j \notin \mathcal{Y}_v, \\
\Delta_{vu}(j) := \min_{i' \notin \mathcal{Y}_u} g_{uv}(i',j), & i \notin \mathcal{Y}_u,\ j \in \mathcal{Y}_v, \\
\Delta_{uv}(i) := \min_{j' \notin \mathcal{Y}_v} g_{uv}(i,j'), & i \in \mathcal{Y}_u,\ j \notin \mathcal{Y}_v, \\
\min\{\Delta_{vu}(j) + \Delta_{uv}(i),\ g_{uv}(i,j)\}, & i \in \mathcal{Y}_u,\ j \in \mathcal{Y}_v.
\end{cases}
\qquad (27b)
$$

The reduction is illustrated in Fig. 5. Taking into account that
$\bar g_{uv}(i',j') = 0$ for $i' \in \mathcal{X}_u \setminus \mathcal{Y}_u$,
$j' \in \mathcal{X}_v \setminus \mathcal{Y}_v$, the reduction can
be interpreted as forcing the inequality

$$ \bar g_{uv}(i,j') + \bar g_{uv}(i',j) - \bar g_{uv}(i,j) - \bar g_{uv}(i',j') \ge 0, \qquad (28) $$

i.e., the non-negativity of mixed discrete derivatives, for all
four-tuples $i \in \mathcal{Y}_u$, $j \in \mathcal{Y}_v$, $i' \notin \mathcal{Y}_u$,
$j' \notin \mathcal{Y}_v$. The cost vector $\bar g$ is
therefore a partial submodular truncation of $g$.
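The case analysis of Definition 6.1 translates directly into code for a single edge. The following is a sketch under assumptions: the function name `reduce_edge`, the dictionary layout and the small cost table are illustrative, and the verification costs are built as $g_{uv}(i,j) = f_{uv}(i,j) - f_{uv}(p_u(i), p_v(j))$, which is what the linear extension yields for pairwise terms.

```python
import itertools


def reduce_edge(g_uv, X_u, X_v, Y_u, Y_v):
    """Reduced pairwise costs for one edge uv, following the four cases
    of the definition. Assumes X_u minus Y_u and X_v minus Y_v are
    non-empty (they always contain the test labels y_u and y_v)."""
    keep_u = [i for i in X_u if i not in Y_u]
    keep_v = [j for j in X_v if j not in Y_v]
    delta_vu = {j: min(g_uv[(i, j)] for i in keep_u) for j in Y_v}
    delta_uv = {i: min(g_uv[(i, j)] for j in keep_v) for i in Y_u}
    reduced = {}
    for i, j in itertools.product(X_u, X_v):
        if i not in Y_u and j not in Y_v:
            reduced[(i, j)] = 0
        elif i not in Y_u:
            reduced[(i, j)] = delta_vu[j]
        elif j not in Y_v:
            reduced[(i, j)] = delta_uv[i]
        else:
            reduced[(i, j)] = min(delta_vu[j] + delta_uv[i], g_uv[(i, j)])
    return reduced


# Illustrative pairwise costs f_uv on three labels per node.
table = [[0, 3, 1], [2, 0, 4], [5, 1, 0]]
X_u = X_v = [0, 1, 2]
Y_u, Y_v = {1, 2}, {2}                 # test labels are y_u = y_v = 0
p_u = lambda i: 0 if i in Y_u else i
p_v = lambda j: 0 if j in Y_v else j
g_uv = {(i, j): table[i][j] - table[p_u(i)][p_v(j)]
        for i in X_u for j in X_v}
reduced = reduce_edge(g_uv, X_u, X_v, Y_u, Y_v)

# Mixed discrete derivatives over all four-tuples of inequality (28).
mixed = [reduced[(i, jp)] + reduced[(ip, j)] - reduced[(i, j)] - reduced[(ip, jp)]
         for i in Y_u for j in Y_v
         for ip in X_u if ip not in Y_u
         for jp in X_v if jp not in Y_v]
```

After the reduction all costs attached to non-displaced labels of a node coincide, which is what permits contracting them to a single representative.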

Recall that Algorithm 1 in each iteration prunes all substitutions
$q \le p$ that do not belong to $\mathbb{S}_f$ based on the solutions of the
verification LP. The following theorem reestablishes optimality of
this step with the above reduction.

Theorem 6.2 (Reduction). Let $p \in \mathcal{P}^{2,y}$ and $\bar g$ be the
corresponding reduced cost vector constructed as in Def. 6.1. Let also
$q \in \mathcal{P}^{2,y}$, $q \le p$. Then $q \in \mathbb{S}_f$ iff
$q \in \mathbb{S}_{\bar g}$. Proof in (55).

From Theorem 6.2 and Corollary 4.4 it follows that $q \in \mathbb{S}_f$ iff
$q_u(\mathcal{O}^*_u) = \mathcal{O}^*_u$, where $\mathcal{O}^*_u$ are the support sets of
optimal solutions to the reduced verification LP,

$$ \operatorname{argmin}_{\mu \in \Lambda}\ \langle \bar g, \mu \rangle. \qquad (29) $$


Therefore it is valid for Algorithms 1 and 2 to consider this
reduced LP and prune all substitutions $q$ that do not satisfy
the property $q_v(\mathcal{O}^*_v) = \mathcal{O}^*_v$. The optimal relaxed solutions and
their support sets can in general differ from those of the original
verification LP, however for the purpose of the algorithm it is
an equivalent replacement potentially affecting only the order in
which substitutions are pruned.

The reduction has the following advantages:

• Subsets of labels $\mathcal{X}_v \setminus \mathcal{Y}_v$ can be contracted to a single
representative label $y_v$, because associated unary and pairwise
costs are equal.

• It allows relaxing (see § 7) the requirements on approximate
dual solvers needed to establish termination and
correctness of the algorithm.

• It is useful for the speed-up heuristics (§ 8). In particular,
it is easier to find a labeling with a negative cost since we
have decreased many edge costs. It will be shown that such a
labeling allows for an early stopping of the dual solver and a
pruning of substitutions without loss of maximality.

7 Persistency with a Finite Number of Dual
Updates

We assume that a suboptimal dual solver is iterative and can be
represented by a procedure dual_update, which given a current
dual point $\varphi$ makes a step resulting in a new dual point and a guess
of a primal integer solution $x$.

In this setting we propose Algorithm 4. In its inner loop, the
algorithm calls dual_update (line 7), checks whether a speed-up
shortcut is available (line 8) and verifies whether it can already
terminate (lines 11-13). If neither occurs in a certain number of
iterations (stopping condition in line 14), the pruning based on
the currently active labels is executed (line 15). After that the cost
vector $\bar g$ is rebuilt, but the dual solver continues from the last found
dual point (warm start).

Algorithm 4: Efficient Iterative Pruning

Input: Problem $f \in \mathbb{R}^{\mathcal{I}}$, test labeling $y \in \mathcal{X}$;
Output: Improving substitution $p \in \mathcal{P}^{2,y} \cap \mathbb{S}_f$;

1  $(\forall u \in \mathcal{V})\ \mathcal{Y}_u := \mathcal{X}_u \setminus \{y_u\}$;
2  Set $\varphi$ to the initial dual solution if available;
3  while true
4      Apply single node pruning; /* speed-up */
5      Construct reduced verification LP $\bar g$ from $f$ and current
       sets $\mathcal{Y}_u$, according to Definition 6.1;
6      repeat
7          $(\varphi, x)$ := dual_update($\bar g$, $\varphi$);
8          if $E_{\bar g}(x) \le 0$ then /* speed-up */
9              Apply pruning cut with $x$;
10             goto step 4 to rebuild $\bar g$;
           /* Verification of Optimality */
11         $\varphi'$ := dual_correct($\bar g$, $\varphi$);
12         $\mathcal{O}_u := \{ i \mid \bar g^{\varphi'}_u(i) = 0 \}$;
13         if $(\forall u \in \mathcal{V})\ \mathcal{O}_u \cap \mathcal{Y}_u = \emptyset$ then return $p$ by (7);
14     until any stopping condition (e.g., iteration limit);
15     Prune: $(\forall u \in \mathcal{V})\ \mathcal{Y}_u := \mathcal{Y}_u \setminus \mathcal{O}_u$;

16 Procedure dual_update($\bar g$, $\varphi$)
   Input: Cost vector $\bar g$, dual point $\varphi$;
   Output: New dual point $\varphi$, approximate primal integer
   solution $x$;

The speed-ups will be explained in the next section; they
are not critical for the overall correctness. Now we focus on
the new termination conditions (lines 11-13). A correction step
(line 11) is introduced whose purpose is to move the slacks from
pairwise terms to unary terms so that active labels become more
decisive. This procedure is defined in Procedure 3. The correction
is not intermixed with dual updates but serves as a proxy between
the solver and the termination conditions. It has the following
property.

Procedure 3: dual_correct($\varphi$, $\bar g$)

1 for $uv \in \mathcal{E}$ do
2     $(\forall i \in \mathcal{X}_u)\ \varphi_{uv}(i) := \varphi_{uv}(i) + \min_{i',j'} \bar g^{\varphi}_{uv}(i',j')$;
3     $(\forall i \in \mathcal{X}_u)\ \varphi_{uv}(i) := \varphi_{uv}(i) + \min_{j} \bar g^{\varphi}_{uv}(i,j)$;
4     $(\forall j \in \mathcal{X}_v)\ \varphi_{vu}(j) := \varphi_{vu}(j) + \min_{i} \bar g^{\varphi}_{uv}(i,j)$;
5 $(\forall u \in \mathcal{V})\ \varphi_u := \varphi_u + \min_{i} \bar g^{\varphi}_u(i)$; /* Normalize */
6 return $\varphi$;

Lemma 7.1. Output $\varphi$ of Procedure 3 is feasible and satisfies

$$ (\forall u \in \mathcal{V})\quad \min_{i \in \mathcal{X}_u} \bar g^{\varphi}_u(i) = 0, \qquad (30) $$

$$ (\forall uv \in \mathcal{E},\ ij \in \mathcal{X}_{uv})\quad \min_{i'} \bar g^{\varphi}_{uv}(i',j) = \min_{j'} \bar g^{\varphi}_{uv}(i,j') = 0. \qquad (31) $$

Moreover, if the input $\varphi$ is feasible, the lower bound does not
decrease.

Proof. Line 2 of Procedure 3 moves a constant from an edge
to a node. This turns the minimum of the terms $\bar g^{\varphi}_{uv}(i,j)$ to zero.
Lines 3 and 4 turn to zero the minimal pairwise value attached
to each label, which provides (31). Line 5 provides (30). In case
of feasibility of the initial $\varphi$, which implies $\bar g^{\varphi} \ge 0$, all values
of $\varphi$ can only increase during steps 2-4 and hence the unary potentials
$\bar g^{\varphi}_u$ remain non-negative. Therefore step 5 cannot decrease the
lower bound value $\bar g^{\varphi}_0$. □

According to Lemma 7.1, Procedure 3 cannot worsen the lower
bound attained by a dual solver. The following theorem guarantees
that when no further pruning is possible, the corrected dual point
constitutes an optimal solution, ensuring persistency.

Theorem 7.2. Let $\varphi$ be a dual point for the reduced problem $\bar g$
satisfying (30)-(31). Then either

(a) $\bar g^{\varphi}_0 = 0$, $\varphi$ is dual optimal and $\delta(y)$ is primal optimal, or

(b) $(\exists u \in \mathcal{V})\ \mathcal{O}_u(\varphi) \cap \mathcal{Y}_u \neq \emptyset$.

Proof. Assume (b) does not hold: $(\forall u \in \mathcal{V})\ \mathcal{O}_u(\varphi) \subseteq \mathcal{X}_u \setminus \mathcal{Y}_u$.
Let us pick in each node $u$ a label $z_u \in \mathcal{O}_u(\varphi)$. As ensured
by (31), for each edge $uv$ there is a label $j \in \mathcal{X}_v$ such that
$\bar g^{\varphi}_{uv}(z_u, j) = 0$ and similarly, there exists $i \in \mathcal{X}_u$ such that
$\bar g^{\varphi}_{uv}(i, z_v) = 0$. By partial submodularity of $\bar g$, we have

$$ \bar g^{\varphi}_{uv}(z_u, z_v) + \bar g^{\varphi}_{uv}(i,j) \le \bar g^{\varphi}_{uv}(z_u, j) + \bar g^{\varphi}_{uv}(i, z_v) = 0. \qquad (32) $$

Therefore, $\bar g^{\varphi}_{uv}(z_u, z_v) \le -\bar g^{\varphi}_{uv}(i,j) \le 0$. Hence
$\bar g^{\varphi}_{uv}(z_u, z_v) = 0$ and it is active. Therefore $\delta(z)$ and the dual
point $\varphi$ satisfy complementary slackness conditions and hence they are
primal-dual optimal and $\bar g^{\varphi}_0 = E_{\bar g}(z) = 0 = E_{\bar g}(y)$. □
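The correction step of Procedure 3 can be sketched in a few lines. This is an illustrative reading of the procedure, not the authors' code: the dictionary layout (`phi_e[(u, v)][i]` for $\varphi_{uv}(i)$, `phi_n[u]` for $\varphi_u$), the helper names and the tiny two-node model are assumptions, and the reparametrized costs follow (21a)-(21b) applied to $\bar g$.

```python
def neighbours(labels, edges):
    """Adjacency lists built from the list of undirected edges."""
    nbrs = {u: [] for u in labels}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    return nbrs


def rep_pair(gp, phi_e, u, v, i, j):
    """Reparametrized pairwise cost on the stored edge (u, v)."""
    return gp[(u, v)][(i, j)] - phi_e[(u, v)][i] - phi_e[(v, u)][j]


def rep_unary(gu, phi_e, phi_n, nbrs, u, i):
    """Reparametrized unary cost of label i in node u."""
    return gu[u][i] + sum(phi_e[(u, v)][i] for v in nbrs[u]) - phi_n[u]


def dual_correct(labels, edges, gu, gp, phi_e, phi_n):
    """Move slack from edges to nodes; modifies phi_e and phi_n in place."""
    nbrs = neighbours(labels, edges)
    for u, v in edges:
        # Line 2: shift the minimum over the whole edge to node u.
        m = min(rep_pair(gp, phi_e, u, v, i, j)
                for i in labels[u] for j in labels[v])
        for i in labels[u]:
            phi_e[(u, v)][i] += m
        # Line 3: zero the minimal pairwise value attached to each label of u.
        for i in labels[u]:
            phi_e[(u, v)][i] += min(rep_pair(gp, phi_e, u, v, i, j)
                                    for j in labels[v])
        # Line 4: zero the minimal pairwise value attached to each label of v.
        for j in labels[v]:
            phi_e[(v, u)][j] += min(rep_pair(gp, phi_e, u, v, i, j)
                                    for i in labels[u])
    # Line 5: normalize so that the minimal unary cost in each node is zero.
    for u in labels:
        phi_n[u] += min(rep_unary(gu, phi_e, phi_n, nbrs, u, i)
                        for i in labels[u])
    return phi_e, phi_n


labels = {"u": [0, 1], "v": [0, 1]}
edges = [("u", "v")]
gu = {"u": {0: 0, 1: -1}, "v": {0: 0, 1: 2}}
gp = {("u", "v"): {(0, 0): 0, (0, 1): 1, (1, 0): -2, (1, 1): 3}}
phi_e = {("u", "v"): {0: 0, 1: 0}, ("v", "u"): {0: 0, 1: 0}}
phi_n = {"u": 0, "v": 0}
dual_correct(labels, edges, gu, gp, phi_e, phi_n)
nbrs = neighbours(labels, edges)
lower_bound = sum(phi_n.values())   # value of the bound, taking gbar_0 = 0
```

In this toy instance the starting point is infeasible (one pairwise cost is negative), and the corrected point happens to reach the minimal energy of the model, attained at the labeling (1, 0).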


Theorem 7.3 (Termination and Correctness of Algorithm 4). For
any stopping condition in line 14, Algorithm 4 terminates in at
most $\sum_v (|\mathcal{X}_v| - 1)$ outer iterations and returns $p \in \mathbb{S}_f$.

Proof. When the algorithm has not yet terminated, some further
pruning is guaranteed to be possible (compare conditions in
lines 13 and 15). The iteration limit follows. When Algorithm 4
terminates, from Theorem 7.2 it follows that $\varphi'$ is dual optimal
and hence $\mathcal{O}_u(\varphi') \supseteq \mathcal{O}^*_u$. Therefore,
$(\forall u \in \mathcal{V})\ \mathcal{O}^*_u \cap \mathcal{Y}_u = \emptyset$,
which is sufficient for $p$ to be strictly $\Lambda$-improving according to
Corollary 4.4. □

In [40] we prove that a similar result holds for a TRW- 
S iteration without correction by arguing on complete chain 
subproblems instead of individual nodes. The correction might be 
needed in case the algorithm does not keep slacks on the nodes, 
e.g. for SRMP [22]. 

The stopping condition in line 14 of Algorithm 4 controls the 
aggressiveness of pruning. Performing fewer iterations may result 
only in the found p not being the maximum, but in any case it 
is guaranteed that Algorithm 4 does not stall and identifies
a correct persistency. In the case when the solver does have 
convergence and optimality guarantees, the time budget controls 
the degree of approximation to the maximum persistency. 
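The effect of the stopping condition can be seen in a skeleton of the outer and inner loops of Algorithm 4 with the speed-ups left out. Everything solver-related here is hypothetical: `dual_step` stands for one dual update followed by the correction and returns the active sets, and `toy_dual_step` is an invented stand-in whose active sets shrink as more steps are taken.

```python
def budgeted_pruning(labels, y, dual_step, budget):
    """Skeleton of the pruning loop with an inner iteration limit.

    dual_step(Y, state) -> (state, O) is a hypothetical callback: one dual
    update plus correction, returning the active label sets O_u.
    budget must be at least 1. Returns the final sets Y_u and the number
    of pruning rounds.
    """
    Y = {u: set(X_u) - {y[u]} for u, X_u in labels.items()}
    state = None                      # dual point, kept across rebuilds
    rounds = 0
    while True:
        for _ in range(budget):       # stopping condition: iteration limit
            state, O = dual_step(Y, state)
            if all(not (O[u] & Y[u]) for u in Y):
                return Y, rounds
        for u in Y:                   # prune the currently active labels
            Y[u] -= O[u]
        rounds += 1


def toy_dual_step(Y, state):
    """Invented solver: after k steps only labels larger than k (and the
    test label 0) still look active."""
    steps = (state or 0) + 1
    O = {u: {0} | {i for i in (1, 2, 3) if i > steps} for u in Y}
    return steps, O


labels = {"u": {0, 1, 2, 3}}
y = {"u": 0}
Y_small, rounds_small = budgeted_pruning(labels, y, toy_dual_step, budget=1)
Y_large, rounds_large = budgeted_pruning(labels, y, toy_dual_step, budget=5)
```

With a budget of one step the toy run prunes labels that a longer run would have kept in $\mathcal{Y}_u$, so the resulting substitution is valid within this sketch but not maximal; the larger budget keeps all three labels.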

8 Speed-ups 

8.1 Inference Termination Without Loss of Maximality 

Next, we propose several sufficient conditions to quickly prune
some substitutions without worsening the final solution found
by the algorithm. As follows from Definition 3.2, existence of
a labeling $x$ such that $\langle (I - [p])^{\mathsf{T}} f, \delta(x) \rangle \le 0$ and
$x \neq p(x)$ is sufficient to prove that substitution $p$ is not strictly
$\Lambda$-improving. Hence one could consider updating the current
substitution $p$ without waiting for an exact solution of the inference
problem in line 4. The tricky part is to find labels that can be pruned
without loss of optimality of the algorithm. Lemma 8.1 below suggests
to solve a simpler verification LP, $\min_{\mu \in \Lambda'} \langle \bar g, \mu \rangle$, over a
subset $\Lambda'$ of $\Lambda$. This does not guarantee to remove all non-improving
substitutions (which implies one has to switch to $\Lambda$ afterwards),
but can be much more efficient than the optimization over $\Lambda$. After
the lemma we provide two examples of such efficient procedures.

Lemma 8.1. Let $p \in \mathcal{P}^{2,y}$ and $\bar g$ be defined by (27) (depends on
$p$). Let $q \in \mathbb{S}_f \cap \mathcal{P}^{2,y}$, $q \le p$, $Q = [q]$. Let
$\Lambda' \subseteq \Lambda$, $Q(\Lambda') \subseteq \Lambda'$
and $\mathcal{O}^* = \operatorname{argmin}_{\mu \in \Lambda'} \langle \bar g, \mu \rangle$. Then
$(\forall v \in \mathcal{V})\ q_v(\mathcal{O}^*_v) = \mathcal{O}^*_v$.

Proof in ?? .

Note, while Theorem 6.2 is necessary and sufficient for prun¬ 
ing, Lemma 8.1 is only sufficient. 
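The witness condition stated at the beginning of this subsection is easy to test for a given labeling. The sketch below assumes the usual property of the linear extension, $[p]\delta(x) = \delta(p(x))$, so that $\langle (I - [p])^{\mathsf{T}} f, \delta(x) \rangle = E_f(x) - E_f(p(x))$; the function names and the toy costs are illustrative, and the constant term $f_0$ is omitted because it cancels in the difference.

```python
def energy(unary, pairwise, x):
    """E_f(x) without the constant term f_0."""
    return (sum(unary[v][x[v]] for v in unary)
            + sum(cost[(x[u], x[v])] for (u, v), cost in pairwise.items()))


def refutes(unary, pairwise, p, x):
    """True if labeling x proves that p is not strictly improving.

    p[v] maps each displaced label of node v to its image; labels that
    are not listed stay in place.
    """
    px = {v: p[v].get(i, i) for v, i in x.items()}
    gap = energy(unary, pairwise, x) - energy(unary, pairwise, px)
    return px != x and gap <= 0


unary = {"u": {0: 1, 1: 0}, "v": {0: 1, 1: 0}}
pairwise = {("u", "v"): {(0, 0): 0, (0, 1): 2, (1, 0): 2, (1, 1): 0}}
p = {"u": {1: 0}, "v": {1: 0}}        # move label 1 to the test label 0
```

Here the labeling (1, 1) has lower energy than its image (0, 0), so it refutes this substitution, whereas (1, 0) does not.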

8.2 Pruning of Negative Labelings 

Assume we found an integer labeling $x$ such that $E_{\bar g}(x) \le 0$ and
$p(x) \neq x$. Lemma 8.1 answers the question for which nodes $v$ the
label $x_v$ can be pruned from the set $\mathcal{Y}_v$ without loss of optimality.
Define the following restriction of the polytope $\Lambda$:

$$ \Lambda_x := \{ \mu \in \Lambda \mid (\forall v \in \mathcal{V})\ \mu_v(y_v) + \mu_v(x_v) = 1 \} \subseteq \Lambda. \qquad (33) $$

Polytope $\Lambda_x$ corresponds to the restriction of $\Lambda$ to the label set
$\{y_v, x_v\}$ in each node $v \in \mathcal{V}$. According to Lemma 8.1 we need
to solve the problem

$$ \mathcal{O}^* := \operatorname{argmin}_{\mu \in \Lambda_x} \langle \bar g, \mu \rangle \qquad (34) $$

and exclude $x_v$ from $\mathcal{Y}_v$ if $x_v \in \mathcal{O}^*_v$. Due to the partial
submodularity of $\bar g$ the problem (34) is submodular and can be solved by
min-cut/max-flow algorithms [23]. Because $x$ was found to have
non-positive energy, necessarily for some nodes $v$ there
will hold $x_v \in \mathcal{O}^*_v \cap \mathcal{Y}_v$ and therefore some pruning will take
place.


8.3 Single Node Pruning 

Let us consider “a single node” polytope
$\Lambda_{u,i} := \{ \mu \in \Lambda \mid \mu_u(y_u) + \mu_u(i) = 1;\ (\forall v \neq u)\ \mu_v(y_v) = 1 \}$.
It is a special case of $\Lambda_x$ when $y$ and $x$ differ in a single node $u$
only and $x_u = i$. In this case problem (34) amounts to calculating
$\bar g_u(x_u) + \sum_{v \in \mathcal{N}(u)} \bar g_{uv}(x_u, y_v)$. If this value is
non-positive, $x_u$ must be excluded from $\mathcal{Y}_u$. The single node pruning
can be applied to all pairs $(u,i)$ exhaustively, but it is more efficient
to keep track of the nodes for which sets $\mathcal{Y}_v$ have changed (either due
to a negative labeling pruning, active labels pruning in line 15 or the
single node pruning itself) and check their neighbors.
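One exhaustive pass of the single node test can be sketched as follows. The data layout, the helper names and the toy costs are assumptions; the sketch only reports which labels fail the test and does not rebuild the reduced costs after pruning, which a full implementation would have to do.

```python
def pair_cost(pairwise, u, v, i, j):
    """Cost of label i at u and label j at v, whichever way the edge is stored."""
    if (u, v) in pairwise:
        return pairwise[(u, v)][(i, j)]
    return pairwise[(v, u)][(j, i)]


def single_node_candidates(unary, pairwise, nbrs, y, Y):
    """Pairs (u, i) with i in Y_u whose single node value is non-positive:
    unary cost of i plus pairwise costs towards the test labels y_v."""
    out = []
    for u in Y:
        for i in sorted(Y[u]):
            value = unary[u][i] + sum(pair_cost(pairwise, u, v, i, y[v])
                                      for v in nbrs[u])
            if value <= 0:
                out.append((u, i))
    return out


# Toy reduced costs on one edge; test labels are y_u = y_v = 0.
unary = {"u": {0: 0, 1: 1, 2: -1}, "v": {0: 0, 1: 2}}
pairwise = {("u", "v"): {(0, 0): 0, (0, 1): 1, (1, 0): -2,
                         (1, 1): -1, (2, 0): 3, (2, 1): 4}}
nbrs = {"u": ["v"], "v": ["u"]}
y = {"u": 0, "v": 0}
Y = {"u": {1, 2}, "v": {1}}
to_prune = single_node_candidates(unary, pairwise, nbrs, y, Y)
```

Only label 1 of node `u` is reported: its value is 1 + (-2) = -1, while the other candidates have positive values.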

8.4 Efficient Message Passing 

The main computational element in dual coordinate ascent solvers
like TRW-S or MPLP is passing a message, i.e., an update of
the form $\min_{i \in \mathcal{X}_u} (f_{uv}(i,j) + a(i))$. In many practical cases the
message passing for $f$ can be computed in time linear in the
number of labels [15, 28, 3]. This is the case when $f_{uv}$ is a
convex function of $i - j$ (e.g., $|i - j|$, $(i - j)^2$) or a minimum of
few such functions (e.g., the Potts model is $\min(1, |i - j|)$). However,
in Algorithm 1 we need to solve the problem with the cost vector
$g = (I - P^{\mathsf{T}}) f$, resp. $\bar g$ (27) if we apply the reduction. It turns
out that whenever there is a fast message passing method for $f$,
the same holds for $\bar g$.

Method        Description
Our-CPLEX     Our Algorithm 1 (Iterative Relaxed Inference) using CPLEX [17].
Our-TRWS      Our Algorithm 4 using TRW-S [20]. Initial solution uses at most
              1000 iterations (or the method has converged). All speedups.
[43]-CPLEX    Method [43] with CPLEX.
[43]-TRWS     Method [43] with TRW-S.
ε-L1 [35]     Single LP formulation of the maximum strong persistency [35]
              solved with CPLEX.
Kovtun[24]    One-against-all method of Kovtun [25].
MQPBO         Multilabel QPBO [19].
MQPBO-10      MQPBO with 10 random permutations, accumulating persistency.

TABLE 1: List of Evaluated Methods


Theorem 8.2 (Fast message passing). Message passing for an
edge term $\bar g_{uv}$ (27) can be reduced to that for $f_{uv}$ in time
$O(|\mathcal{Y}_u| + |\mathcal{Y}_v|)$. Proof in Table 6.

This complexity is proportional to the size of the sets $\mathcal{Y}_v$. The
more labels are pruned from the sets $\mathcal{Y}_v$ in the course of the algorithm,
the less work is required.
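As a reminder of what a linear-time message looks like for the original costs $f$, the sketch below compares the generic quadratic computation with the well-known shortcut for a Potts term $f_{uv}(i,j) = w \cdot [i \neq j]$ with $w \ge 0$. It covers only this special case for $f$; it does not implement the reduction of Theorem 8.2, and the function names are illustrative.

```python
def message_naive(f_uv, a):
    """Generic message: for every j, minimize f_uv(i, j) + a[i] over i."""
    K = len(a)
    return [min(f_uv(i, j) + a[i] for i in range(K)) for j in range(K)]


def message_potts(a, weight=1.0):
    """Linear-time message for f_uv(i, j) = weight * [i != j], weight >= 0.

    Either the label stays the same (cost a[j]) or it switches from the
    globally cheapest label (cost min(a) + weight).
    """
    best = min(a)
    return [min(a_j, best + weight) for a_j in a]


a = [3.0, 0.5, 2.0, 0.7]
potts = lambda i, j: 0.0 if i == j else 1.0
slow = message_naive(potts, a)
fast = message_potts(a, weight=1.0)
```

Both computations return the same vector; the second needs one pass to find the minimum and one pass to form the message.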

Note that, contrary to limiting the number of iterations of
a dual solver, described in § 7, the speedups presented in this
section do not sacrifice the persistency maximality (6). In our
experiments for some instances, Algorithm 4 finished before ever
reaching step 15. In such cases the found substitution $p \in \mathbb{S}_f$ is
the maximum.

9 Experimental Evaluation¹

In the experiments we study how well we approximate the
maximum persistency [35], Table 2; illustrate the contribution of
different speedups, Table 4; give an overall performance comparison
to a larger set of relevant methods, Table 3; and provide
a more detailed direct comparison to the most relevant scalable
method [43] using exact and approximate LP solvers, Table 5. As a
measure of persistency we use the percentage of labels eliminated
by the improving substitution $p$:

$$ \frac{\sum_{v \in \mathcal{V}} |\mathcal{X}_v \setminus p_v(\mathcal{X}_v)|}{\sum_{v \in \mathcal{V}} (|\mathcal{X}_v| - 1)} = \frac{\sum_{v \in \mathcal{V}} |\mathcal{Y}_v|}{\sum_{v \in \mathcal{V}} (|\mathcal{X}_v| - 1)}. $$
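The measure is straightforward to compute from the final sets $\mathcal{Y}_v$. A minimal sketch, with illustrative names and a made-up three-node example:

```python
def eliminated_percentage(labels, Y):
    """Percentage of labels eliminated by a subset-to-one substitution:
    sum of |Y_v| divided by sum of (|X_v| - 1), times 100."""
    eliminated = sum(len(Y[v]) for v in labels)
    total = sum(len(labels[v]) - 1 for v in labels)
    return 100.0 * eliminated / total


labels = {"a": {0, 1, 2}, "b": {0, 1, 2}, "c": {0, 1}}
Y = {"a": {1, 2}, "b": {2}, "c": set()}
score = eliminated_percentage(labels, Y)
```

A value of 100% means that every node keeps a single label, i.e., the substitution determines a complete labeling.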

9.1 Random Instances 

Table 2 gives a comparison to [43] and [35] on random instances
generated as in [35] (small problems on a 4-connected grid with
uniformly distributed integer potentials for the “full” model and of
the Potts type for the “Potts” model, all not LP-tight). It can be
seen that our exact Algorithm 1 performs identically to the ε-L1
formulation [35]. Although it solves a series of LPs, as opposed
to a single LP solved by ε-L1, it scales better to larger instances.
Instances of size 20x20 in the ε-L1 formulation are already too
difficult for CPLEX: it takes excessive time and sometimes returns
a computational error. The performance of the dual Algorithm 4
confirms that we lose very little in terms of persistency but gain
significantly in speed.


¹ The implementation of our method is available through
http://cmp.felk.cvut.cz/~shekhovt/persistency


9.2 Benchmark Problems 

Table 3 summarizes average performance on the OpenGM MRF 
benchmark [18]. The datasets include previous benchmark in¬ 
stances from computer vision [45] and protein structure predic¬ 
tion [29, 51] as well as other models from the literature. Results 
per instance are given in § B. 

9.3 Speedups 

In this experiment we report how much speed improvement was 
achieved with each subsequent technique of § 8. The evaluation 
in Table 4 starts with a basic implementation (using only a warm 
start). The solver is allowed to run at most 50 iterations in the 
partial optimality phase until pruning is attempted. We expect that 
on most datasets the percentage of persistent labels improves when 
we apply the speedups (since they are without loss of maximality). 

9.4 Discussion 

Tables 2 and 5 demonstrate that Our-TRWS, which is using 
a suboptimal dual solver, closely approximates the maximum 
persistency [35]. Our method is also significantly faster and scales 
much better. The method [43] is the closest contender to ours in 
terms of algorithm design. Tables 2, 3 and 5 clearly show that 
our method determines a larger set of persistent variables. This 
holds true with exact (CPLEX) as well as approximate (TRWS) 
solvers. There are two reasons for that as discussed in § 4.3.
First, we optimize over a larger set of substitutions than [43], 
i.e., we identify per-label persistencies while [43] is limited to 
the whole-variable persistencies. Second, even in the case of the 
whole-variable persistencies the criterion in [43] is in general 
weaker than (18) and depends on the initial reparametrization 
of the problem. This latter difference does not matter for Potts
models [44], the examples Figs. 6 and 8, but does matter, e.g., in 
Fig. 9. Although our method searches over a significantly larger 
space of possible substitutions, it needs fewer TRW-S iterations 
due to speedup techniques. Details on iteration counts can be 
found in § B. In the comparison of running time it should be 
taken into account that different methods are optimized to a 
different degree. Nevertheless, it is clear that the algorithmic 
speedups were crucial in making the proposed method much more 
practical than [43] and [35] while maintaining high persistency 
recall quality. 

To provide more insight into the numbers reported, we illustrate
in Figs. 1 and 6 to 9 some interesting cases. Fig. 6 shows “the
hardest” instance of the color-seg-n4 family. Identified persistencies
allow fixing a single label in most of the pixels, but for
some pixels more than one possible label remains. The remainder
of the problem has the reduced search space $p(\mathcal{X})$, which can be
passed to further solvers. The tsukuba image (Fig. 1) is interesting
because it has appeared in many previous works. The performance
of graph-cut based persistency methods relies very much on strong
unary costs, while the proposed method is more robust. Fig. 7
shows an easy example from object-seg, where the LP relaxation
is tight, the dual solver finds the optimal labeling $y$ and our
verification LP confirms that this solution is unique. In Fig. 8
we show a hard instance of mrf-stereo. Part of the reason for its
hardness is the integer costs, leading to non-uniqueness of the optimal
solution. In Fig. 9, photomontage/pano instance, we report
79% solution completeness, but most of these 79% correspond to
trivial forbidden labels in the problem (very big unary costs). At
the same time other methods perform even worse. This problem

Problem family    [43]-CPLEX        [43]-TRWS         ε-L1 [35]         Our-CPLEX         Our-TRWS
10x10 Potts-3     0.18s   58.46%    0.05s   58.38%    0.05s   72.27%    0.18s   72.27%    0.04s   72.21%
10x10 full-3      0.24s    2.64%    0.09s    1.22%    0.06s   62.90%    0.24s   62.90%    0.05s   62.57%
20x20 Potts-3     3.25s   73.95%    0.21s   68.49%    0.87s   87.38%    2.43s   87.38%    0.06s   87.38%
20x20 full-3      2.81s    0.83%    0.37s    0.83%    0.95s   72.66%    3.03s   72.66%    0.07s   72.31%
20x20 Potts-4    12.45s   23.62%    0.39s   18.43%   19.40s   74.28%    8.56s   74.28%    0.08s   73.63%
20x20 full-4      3.96s    0.01%    0.39s    0.01%   21.08s    6.58%   12.41s    6.58%    0.08s    6.58%

TABLE 2: Performance evaluation on random instances of [35]. For each problem family (size, type of potentials and number of labels)
average performance over 100 samples is given. To allow for precise comparison all methods are initialized with the same test labeling y found
by LP relaxation. Our-TRWS closely approximates Our-CPLEX, which matches ε-L1 [35], and scales much better.


Problem family     #I   #L      #V          MQPBO           MQPBO-10        Kovtun[24]       [43]-TRWS       Our-TRWS
mrf-stereo         3    16-60   > 100000    †               †               1.0s    0.23%    2.5h   13%      117s    73.56%
mrf-photomontage   2    5-7     < 514080    93s    22%      866s    16%     0.4s   15.94%    3.7h   16%      483s    41.98%
color-seg          3    3-4     < 424720    22s    11%      87s     16%     0.3s   98%       1.3h   99.88%   61.8s   99.95%
color-seg-n4       9    3-12    < 86400     22s     8%      398s    14%     0.2s   67%       321s   90%      4.9s    99.26%
ProteinFolding     21   < 483   < 1972      685s    2%      2705s    2%     0.02s   4.56%    48s    18%      9.2s    55.70%
object-seg         5    4-8     68160       3.2s    0.01%   †               0.1s   93.86%    138s   98.19%   2.2s    100%

TABLE 3: Performance on OpenGM benchmarks. Columns #I, #L, #V denote the number of instances, labels and variables respectively. For
each method an average over all instances in a family is reported. † - result is not available (memory / implementation limitation).


| Instance | Initialization (1000 it.) | Extra time for persistency: no speedups | +reduction | +node pruning | +labeling pruning | +fast msgs |
|---|---|---|---|---|---|---|
| Protein folding 1CKK | 8.5s | 268s (26.53%) | 168s (26.53%) | 2.0s (26.53%) | 2.0s (26.53%) | 2.0s (26.53%) |
| colorseg-n4 pfau-small | 9.3s | 439s (88.59%) | 230s (93.41%) | 85s (93.41%) | 76s (93.41%) | 19s (93.41%) |


TABLE 4: Evaluation of speedups on selected examples: computational time drops, as from left to right we add techniques described in § 6. 
1CKK: an example when the final time for persistency is only a fraction of the initialization time, pfau-small: an example when times for 
initialization and persistency are comparable; speedups also help to improve the persistency as they are based on exact criteria. 


| Instance | #L | #V | [43]-CPLEX | [43]-TRWS | Our-CPLEX | Our-TRWS |
|---|---|---|---|---|---|---|
| 1CKK | ≤ 445 | 38 | 2503s, 0% | 46s, 0% | 2758s, 27% | 8.5+2s, 26.53% |
| 1CM1 | ≤ 350 | 37 | 2388s, 0% | 51s, 0% | 4070s, 34% | 9+3.9s, 29.97% |
| 1SY9 | ≤ 425 | 37 | 1067s, 0% | 67s, 0% | 2629s, 51% | 11+4.2s, 57.98% |
| 2BBN | ≤ 404 | 37 | 9777s, 0% | 5421s, 0% | 9677s, 9% | 16+4.3s, 14.17% |
| PDB1B25 | ≤ 81 | 1972 | 325s, 22% | 120s, 22% | 1599s, 84% | 4.3+7.3s, 87.84% |
| PDB1D2E | ≤ 81 | 1328 | 483s, 59% | 83s, 59% | 154s, 98% | 1.6+1.8s, 98.25% |

TABLE 5: Comparison to [43] using exact and approximate LP solvers. Examples of hard ProteinFolding instances [29, 51]. For
Our-TRWS the initialization + persistency time is given. An occasionally better persistency of Our-TRWS vs. Our-CPLEX is explained
by different test labelings produced by the CPLEX and TRW-S solvers (unlike in Table 2). The results of ε-L1 [35] would be identical to
Our-CPLEX, as has been proven and verified on random instances.

has hard interaction constraints. It seems that hard constraints 
and ambiguous solutions pose difficulties to all methods including 
ours. 

10 Conclusions and Outlook 

We presented an approach to find persistencies for an exp-APX-complete
problem employing only solvers for a convex relaxation.
Using a suboptimal solver for the relaxed problem, we
still correctly identify persistencies while the whole approach
becomes scalable. Our method with an exact solver matches
the maximum persistency [35] and with a suboptimal solver
closely approximates it, outperforming state-of-the-art persistency
techniques [43, 19, 25]. The speedups we have developed allow us to
achieve this at a reasonable computational cost, making the method
much more practical than the works [35, 43] we build on. In fact,
our approach takes an approximate solver, like TRW-S, and turns
it into a method with partial optimality guarantees at a reasonable
computational overhead.


We believe that many of the presented results can be extended 
to higher order graphical models and tighter relaxations. Practical 
applicability with other approximate solvers can be explored. A 
further research direction that seems promising is mixing different 
optimization strategies such as persistency and cutting plane 
methods. 
that our partial labeling (the part with one label remaining) is larger.

Appendix A
Proofs

Proofs of the Generic Algorithms 

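Proposition 4.3. For a given substitution $p$, let $\mathcal{O}^* \subseteq \Lambda$ denote the
set of minimizers of the verification LP (18) and

$$\mathcal{O}^*_v := \{ i \in \mathcal{X}_v \mid (\exists \mu \in \mathcal{O}^*)\ \mu_v(i) > 0 \}, \tag{19}$$

which is the support set of all optimal solutions in node $v$. Then

$p \in \mathbb{S}_f$ iff $(\forall v \in \mathcal{V}\ \forall i \in \mathcal{O}^*_v)\ p_v(i) = i$.

Proof. Direction ⇒. Let $p \in \mathbb{S}_f$. Assume for contradiction that
$(\exists v \in \mathcal{V}\ \exists i \in \mathcal{O}^*_v)\ p_v(i) \neq i$. Since $i \in \mathcal{O}^*_v$ there exists $\mu \in \mathcal{O}^*$ such that
$\mu_v(i) > 0$. Its image $\mu' = [p]\mu$ has $\mu'_v(i) = 0$ due to $p_v(i) \neq i$ by
evaluating the extension (10). This contradicts $[p]\mu = \mu$.

Direction ⇐. Now let $(\forall v \in \mathcal{V}\ \forall i \in \mathcal{O}^*_v)\ p_v(i) = i$. Clearly,
$[p]\mu = \mu$ holds for all $\mu$ on the support set given by $(\mathcal{O}^*_v \mid v \in \mathcal{V})$,
hence for $\mathcal{O}^*$. It remains to show that the value of the minimum
in (18) is zero. For $\mu \in \mathcal{O}^*$ we have $[p]\mu = \mu$ and the objective
in (18), $\langle (I - [p])^{\mathsf T} f, \mu \rangle = \langle f, \mu - [p]\mu \rangle$, vanishes. □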
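
The criterion of Proposition 4.3 is a simple per-label test once the support sets are known. The following is a minimal sketch, assuming the support sets $\mathcal{O}^*_v$ have already been extracted from an LP solution; the function names and the dictionary-based data layout are illustrative choices, not part of the paper.

```python
from typing import Dict, Hashable, Set

Node = Hashable
Label = Hashable


def is_strictly_improving(
    p: Dict[Node, Dict[Label, Label]],
    support: Dict[Node, Set[Label]],
) -> bool:
    """Test of Proposition 4.3: p_v(i) == i for every i in the support O*_v.

    `support` is assumed to hold, per node, the labels that carry positive
    mass in some optimal solution of the verification LP.
    """
    return all(p[v][i] == i for v, labels in support.items() for i in labels)


def violating_labels(
    p: Dict[Node, Dict[Label, Label]],
    support: Dict[Node, Set[Label]],
) -> Dict[Node, Set[Label]]:
    """Support labels that p moves; a non-empty set witnesses that the test fails."""
    return {v: {i for i in labels if p[v][i] != i} for v, labels in support.items()}


# Toy example: one node with labels 0, 1, 2; p maps label 1 to label 0.
p = {"u": {0: 0, 1: 0, 2: 2}}
print(is_strictly_improving(p, {"u": {0, 2}}))  # support avoids the moved label
print(violating_labels(p, {"u": {1, 2}}))       # label 1 is in the support and is moved
```
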

Theorem 4.6. Substitution $p$ returned by Algorithm 1 is the maximum
of $\mathbb{S}_f \cap \mathcal{P}^{2,y}$ and thus it solves (6).

Proof. The two following lemmas form a basis for the proof.

Lemma A.1 ([36], Thm. 3(b)). Let $q \in \mathbb{S}_f$, $q \le p$ and let
$\mathcal{O}^* = \operatorname{argmin}_{\mu \in \Lambda} \langle (I - [p])^{\mathsf T} f, \mu \rangle$, i.e., as in line 4 of Algorithm 1.

Then $(\forall \mu \in \mathcal{O}^*)\ [q]\mu = \mu$.

In the case of substitutions from the class $\mathcal{P}^{2,y}$, the statement
additionally simplifies as follows.

Corollary A.2. Assume conditions of Lemma A.1 and additionally,
let $q \in \mathcal{P}^{2,y}$ and $\mathcal{O}^*_v := \{ i \in \mathcal{X}_v \mid (\exists \mu \in \mathcal{O}^*)\ \mu_v(i) > 0 \}$. Then

$$(\forall v \in \mathcal{V},\ \forall i \in \mathcal{O}^*_v)\ q_v(i) = i. \tag{36}$$

Proof. It follows similarly to Corollary 4.4. Assume for contradiction
that $(\exists v \in \mathcal{V}\ \exists i \in \mathcal{O}^*_v)\ q_v(i) \neq i$. Since $i \in \mathcal{O}^*_v$ there exists
$\mu \in \mathcal{O}^*$ such that $\mu_v(i) > 0$. Its image $\mu' = [q]\mu$ has $\mu'_v(i) = 0$
due to $q_v(i) \neq i$ by evaluating the extension (10). This contradicts
$[q]\mu = \mu$, the statement of Lemma A.1. □

Lemma A.3. Let $p^t$ denote the substitution $p$ computed in line 3 of
Algorithm 1 on iteration $t$. The algorithm maintains the invariant that

$(\forall q \in \mathbb{S}_f \cap \mathcal{P}^{2,y})\ q \le p^t$.

Proof. We prove by induction. The statement holds trivially for the
first iteration. Assume it is true for the current iteration $t$. Then for
any $q \in \mathbb{S}_f \cap \mathcal{P}^{2,y}$ there holds $q \le p^t$ and therefore Corollary A.2 applies.
We can show that line 8 only prunes substitutions that are not in
$\mathbb{S}_f \cap \mathcal{P}^{2,y}$ as follows.

Let $p^{t+1}$ be the substitution on the next iteration, i.e. computed
by line 3 after pruning in line 8.

Assume for contradiction that $\exists q \in \mathbb{S}_f \cap \mathcal{P}^{2,y}$ such that $q \not\le p^{t+1}$.
By negating the definition and expanding,

$$(\exists v \in \mathcal{V})\ p^{t+1}_v(\mathcal{X}_v) \not\subseteq q_v(\mathcal{X}_v), \tag{37a}$$

$$\Leftrightarrow\ (\exists v \in \mathcal{V}\ \exists i \in \mathcal{X}_v)\ i \in p^{t+1}_v(\mathcal{X}_v) \wedge i \notin q_v(\mathcal{X}_v), \tag{37b}$$

$$\Leftrightarrow\ (\exists v \in \mathcal{V}\ \exists i \in \mathcal{X}_v)\ p^{t+1}_v(i) = i \wedge q_v(i) \neq i. \tag{37c}$$

If $i$ was pruned in line 8, $i \in \mathcal{O}^*_v$, then it must be that $q_v(i) = i$,
which contradicts (37c). Therefore

$$(\exists v \in \mathcal{V}\ \exists i \in \mathcal{X}_v \setminus \mathcal{O}^*_v)\ p^{t+1}_v(i) = i \wedge q_v(i) \neq i. \tag{38}$$

However, in this case $p^{t+1}_v(i) = p^t_v(i) = i$ and $q \le p^t$ fails to hold,
which contradicts the assumption of induction. Therefore $q \le p^{t+1}$
holds by induction on every iteration. □

By Proposition 4.5 the algorithm terminates and returns a substitution
in $\mathbb{S}_f \cap \mathcal{P}^{2,y}$. By Lemma A.3 the returned substitution $p$ satisfies
$p \ge q$ for all $q \in \mathbb{S}_f \cap \mathcal{P}^{2,y}$. It is the maximum. □
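
The pruning loop analysed in Lemma A.3 can be sketched as follows. This is only an illustration reconstructed from the proofs above: it assumes the algorithm starts from the substitution that maps every label to the test label $y_v$, and it treats the verification LP as a black box. The callback `solve_support` is hypothetical and stands for "solve the verification LP for $p$ and return the support sets $\mathcal{O}^*_v$".

```python
from typing import Callable, Dict, Hashable, List, Set

Node = Hashable
Label = Hashable
Substitution = Dict[Node, Dict[Label, Label]]


def build_substitution(labels, y, Y) -> Substitution:
    """Substitution of class P^{2,y}: labels in Y_v go to y_v, the rest stay."""
    return {v: {i: (y[v] if i in Y[v] else i) for i in labels[v]} for v in labels}


def iterative_pruning(
    labels: Dict[Node, List[Label]],
    y: Dict[Node, Label],
    solve_support: Callable[[Substitution], Dict[Node, Set[Label]]],
) -> Substitution:
    # Assumed initialization: try to eliminate every label except y_v.
    Y = {v: set(labels[v]) - {y[v]} for v in labels}
    while True:
        p = build_substitution(labels, y, Y)
        support = solve_support(p)  # hypothetical LP oracle
        moved = {v: support[v] & Y[v] for v in labels}
        if not any(moved.values()):
            return p  # test of Proposition 4.3 passes
        for v in labels:
            Y[v] -= moved[v]  # pruning step: these labels become immovable


# Toy oracle that always reports the same support, for illustration only.
fixed_support = {"u": {0, 2}, "v": {1}}
result = iterative_pruning(
    {"u": [0, 1, 2], "v": [0, 1]}, {"u": 0, "v": 1}, lambda p: fixed_support
)
print(result)
```

Each non-terminal iteration removes at least one label from some $\mathcal{Y}_v$, so the loop stops after at most $\sum_v |\mathcal{X}_v|$ oracle calls.
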


Proposition 5.2. Arc consistency is a necessary condition for relative
interior optimality: if $\mathcal{O}_v(\varphi) = \mathcal{O}^*_v$ for all $v \in \mathcal{V}$ then $f^\varphi$ is arc
consistent.

Proof. Condition $\mathcal{O}_v(\varphi) = \mathcal{O}^*_v$ implies that $\varphi$ satisfies strict
complementarity with some primal optimal solution $\mu$. The strict
complementarity implies that $(\forall i \in \mathcal{X}_u)\ (f^\varphi_u(i) = 0 \Leftrightarrow \mu_u(i) > 0)$. By
feasibility of $\mu$, there must hold $(\forall v \in \mathcal{N}(u))\ (\exists j \in \mathcal{X}_v)\ \mu_{uv}(i,j) > 0$.
And by using complementary slackness again, it must be that
$f^\varphi_{uv}(i,j) = 0$. Similarly, the second condition of arc consistency is
verified. It follows that $f^\varphi$ is arc consistent. □

Proofs of the Reduction 

The proof of the reduction Theorem 6.2 and Lemma 8.1 (used in 
speed-up heuristics) requires several intermediate results. Recall that 
a correct pruning can be done when we have a guarantee to preserve 
all strictly improving substitutions q, assuming q < p. Therefore 
statements in this section are formulated for such pairs. We will 
consider adjustments to the cost vector that preserve the set of strictly 
improving substitutions. These adjustments do not in general preserve 
optimal solutions to the associated LP relaxation. 

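Lemma A.4. Let $q \le p$. Then $q \in \mathbb{S}_f$ iff $q \in \mathbb{S}_g$ for $g = (I - [p])^{\mathsf T} f$.

Proof. Let $Q = [q]$, $P = [p]$. Since $q \le p$ there holds $PQ = P$. It
implies $(I - P)(I - Q) = (I - Q)$. Therefore,

$$\langle g, (I - Q)\mu \rangle = \langle (I - P)^{\mathsf T} f, (I - Q)\mu \rangle \tag{39}$$

$$= \langle f, (I - P)(I - Q)\mu \rangle = \langle f, (I - Q)\mu \rangle. \tag{40}$$

Assume $\mu \in \Lambda$ is such that $Q\mu \neq \mu$. Equality (39) ensures that
$\langle g, (I - Q)\mu \rangle > 0$ iff $\langle f, (I - Q)\mu \rangle > 0$. The lemma follows from
the definition of $\mathbb{S}_f$, $\mathbb{S}_g$. □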
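
The matrix identity used in the proof of Lemma A.4 can be checked numerically on a small case. The sketch below is restricted to a single node with four labels (so the linear extension is just a 0/1 matrix whose column $i$ is the indicator of $p(i)$); the helper names are illustrative and the example substitutions are chosen by hand so that $q \le p$.

```python
def substitution_matrix(p, n):
    """0/1 matrix M with M[r][i] = 1 iff p(i) == r (column i selects p(i))."""
    return [[1 if p[i] == r else 0 for i in range(n)] for r in range(n)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)] for r in range(n)]


def subtract(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


n = 4
identity = [[1 if r == c else 0 for c in range(n)] for r in range(n)]

# p eliminates labels 1 and 2 (maps them to 0); q eliminates only label 1.
p = [0, 0, 0, 3]
q = [0, 0, 2, 3]
P, Q = substitution_matrix(p, n), substitution_matrix(q, n)

lhs = matmul(subtract(identity, P), subtract(identity, Q))
rhs = subtract(identity, Q)
print(matmul(P, Q) == P, lhs == rhs)
```
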

To reformulate the condition $q \in \mathbb{S}_f$ we will use the following
dual characterization.

Theorem A.5 (Characterization [36]). Let $P = [p]$. Then

$$(\forall \mu \in \Lambda)\ \langle f, P\mu \rangle \le \langle f, \mu \rangle \tag{41}$$

iff there exists a reparametrization $\varphi$ such that

$$P^{\mathsf T} f^\varphi \le f^\varphi. \tag{42}$$

The following lemma assumes an arbitrary substitution $q$, not
necessarily in $\mathcal{P}^{2,y}$, and takes as input sets $\mathcal{U}_u$ that are subsets
of immovable labels. In the context of Theorem 6.2, we will use
$\mathcal{U}_u = \mathcal{X}_u \setminus \mathcal{Y}_u$.

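Lemma A.6 (Reduction 1). For a substitution $q$ let $\mathcal{U}_u \subseteq \{ i \in \mathcal{X}_u \mid q_u(i) = i \}$
for all $u \in \mathcal{V}$. Let $g_{uv}(i,j) = 0$ for all $(i,j) \in \mathcal{U}_u \times \mathcal{U}_v$
and let $\bar g$ be defined by

$$\bar g_v = g_v, \quad v \in \mathcal{V}; \tag{43a}$$

$$\bar g_{uv}(i,j) = \begin{cases} \min_{i' \in \mathcal{U}_u} g_{uv}(i',j), & i \in \mathcal{U}_u,\ j \notin \mathcal{U}_v, \\ \min_{j' \in \mathcal{U}_v} g_{uv}(i,j'), & i \notin \mathcal{U}_u,\ j \in \mathcal{U}_v, \\ g_{uv}(i,j), & \text{otherwise.} \end{cases} \tag{43b}$$

Then $q \in \mathbb{S}_g$ iff $q \in \mathbb{S}_{\bar g}$.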
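
As an illustration of definition (43b), the following sketch applies Reduction 1 to a single edge cost matrix. It assumes a dense list-of-lists representation with integer label indices and non-empty sets $\mathcal{U}_u$, $\mathcal{U}_v$; the function name `reduction_1` is an illustrative choice and not taken from the paper.

```python
def reduction_1(g_uv, U_u, U_v):
    """Apply (43b) to one pairwise term.

    g_uv : matrix with g_uv[i][j], assumed zero on U_u x U_v
    U_u, U_v : non-empty sets of immovable label indices at nodes u and v
    """
    n_u, n_v = len(g_uv), len(g_uv[0])
    # min over i' in U_u for each column j, and over j' in U_v for each row i
    col_min = [min(g_uv[i2][j] for i2 in U_u) for j in range(n_v)]
    row_min = [min(g_uv[i][j2] for j2 in U_v) for i in range(n_u)]
    reduced = [row[:] for row in g_uv]
    for i in range(n_u):
        for j in range(n_v):
            if i in U_u and j not in U_v:
                reduced[i][j] = col_min[j]
            elif i not in U_u and j in U_v:
                reduced[i][j] = row_min[i]
    return reduced


g = [[0, 3, 5],
     [0, 1, 7],
     [2, 4, 6]]
print(reduction_1(g, U_u={0, 1}, U_v={0}))
```

In this example the rows indexed by $\mathcal{U}_u$ become identical outside $\mathcal{U}_v$, which is the property later exploited for fast message passing.
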




Proof. Direction ⇐. Let us verify the following inequality:

$$(\forall ij \in \mathcal{X}_{uv})\quad g_{uv}(q_u(i), q_v(j)) - \bar g_{uv}(q_u(i), q_v(j)) \le g_{uv}(i,j) - \bar g_{uv}(i,j). \tag{44}$$

We need to consider only cases where $g_{uv}(i,j) \neq \bar g_{uv}(i,j)$. Let
$i \in \mathcal{U}_u$ and $j \notin \mathcal{U}_v$ (the remaining case is symmetric). In this case
$q_u(i) = i$. Substituting $\bar g$ we have to prove

$$g_{uv}(i, q_v(j)) - \min_{i' \in \mathcal{U}_u} g_{uv}(i', q_v(j)) \le g_{uv}(i,j) - \min_{i' \in \mathcal{U}_u} g_{uv}(i',j). \tag{45}$$

The left hand side is zero because all the respective components of
$g$ are zero by assumption. At the same time the right hand side is
non-negative since $i \in \mathcal{U}_u$. The inequality (44) holds. It implies (by
multiplication with pairwise components of $\mu$ and using the equality
of unary components of $g$ and $\bar g$) that

$$(\forall \mu \in \Lambda)\quad \langle g, Q\mu \rangle - \langle \bar g, Q\mu \rangle \le \langle g, \mu \rangle - \langle \bar g, \mu \rangle, \tag{46}$$

where $Q = [q]$. Note, cost vector $\bar g$ satisfying (46) is called auxiliary
for $g$ in [25, 34]. Inequality (46) is equivalent to

$$\langle \bar g, (I - Q)\mu \rangle \le \langle g, (I - Q)\mu \rangle. \tag{47}$$

Whenever the left hand side of (47) is strictly positive then so is the
right hand side and therefore from $q \in \mathbb{S}_{\bar g}$ follows $q \in \mathbb{S}_g$.

Direction ⇒. Assume $q \in \mathbb{S}_g$. By Theorem A.5, there exist dual
multipliers $\varphi$ such that $g^\varphi$ verifies inequality (42), in components:

$$(\forall u \in \mathcal{V},\ \forall i \in \mathcal{X}_u)\quad g^\varphi_u(q_u(i)) \le g^\varphi_u(i); \tag{48}$$

$$(\forall uv \in \mathcal{E},\ \forall ij \in \mathcal{X}_{uv})\quad g^\varphi_{uv}(q_u(i), q_v(j)) \le g^\varphi_{uv}(i,j).$$

Let us expand the pairwise inequality in the case $i \in \mathcal{U}_u$, $j \notin \mathcal{U}_v$. Let
$q_v(j) = j^*$. Using $q_u(i) = i$ we obtain

$$g_{uv}(i,j^*) - \varphi_{uv}(i) - \varphi_{vu}(j^*) \le g_{uv}(i,j) - \varphi_{uv}(i) - \varphi_{vu}(j);$$

$$g_{uv}(i,j^*) - \varphi_{vu}(j^*) \le g_{uv}(i,j) - \varphi_{vu}(j). \tag{49}$$

We take min over $i \in \mathcal{U}_u$ of both sides:

$$\min_{i \in \mathcal{U}_u} g_{uv}(i,j^*) - \varphi_{vu}(j^*) \le \min_{i \in \mathcal{U}_u} g_{uv}(i,j) - \varphi_{vu}(j). \tag{50}$$

Finally we subtract $\varphi_{uv}(i)$ on both sides and obtain

$$\bar g^\varphi_{uv}(i,j^*) \le \bar g^\varphi_{uv}(i,j). \tag{51}$$

The case when $i \notin \mathcal{U}_u$, $j \in \mathcal{U}_v$ is symmetric. In the remaining cases,
$\bar g^\varphi_{uv}(i,j) = \bar g_{uv}(i,j) - \varphi_{uv}(i) - \varphi_{vu}(j) = g_{uv}(i,j) - \varphi_{uv}(i) - \varphi_{vu}(j) = g^\varphi_{uv}(i,j)$.
In total, $\bar g^\varphi$ satisfies all component-wise inequalities that
$g^\varphi$ does in (48). By Theorem A.5,

$$(\forall \mu \in \Lambda)\quad \langle \bar g, Q\mu \rangle \le \langle \bar g, \mu \rangle. \tag{52}$$

We have shown that $\langle \bar g, (I - Q)\mu \rangle \ge 0$. It remains to prove that
the inequality holds strictly when $Q\mu \neq \mu$. Since $q \in \mathbb{S}_g$, there
holds $\langle g, \mu \rangle > \langle Q^{\mathsf T} g, \mu \rangle$. It is necessary that at least one of unary
or pairwise inequalities (48) from the support of $\mu$ holds strictly, in
which case inequality (52) is also strict. □

Lemma A.7 (Reduction 2). For a substitution $q$ and cost vector $g$ let
$\bar g = g - \Delta^+$, where $\Delta^+ \ge 0$ has zero unary components and its
pairwise components read:

$$\Delta^+_{uv}(i,j) = \max\{ 0,\ g_{uv}(i,j) + g_{uv}(q_u(i), q_v(j)) - g_{uv}(i, q_v(j)) - g_{uv}(q_u(i), j) \}. \tag{53}$$

Then $q \in \mathbb{S}_g$ iff $q \in \mathbb{S}_{\bar g}$.

Proof. The scheme of the proof is similar to Lemma A.6. The unary
components of $g$ and $\bar g$ are equal. If we show inequality (44), the
implication $q \in \mathbb{S}_{\bar g} \Rightarrow q \in \mathbb{S}_g$ will follow as in Lemma A.6. For our
$\bar g$, inequality (44) reduces to

$$\Delta^+_{uv}(q_u(i), q_v(j)) \le \Delta^+_{uv}(i,j) \tag{54}$$

and due to idempotency of $q$ the left hand side is identically zero.
Therefore inequality (44) is verified.

Direction ⇒. Assume $q \in \mathbb{S}_g$. By Theorem A.5, there exist dual
multipliers $\varphi$ satisfying inequalities (48). Consider

$$\bar g^\varphi = (g - \Delta^+)^\varphi = g^\varphi - \Delta^+. \tag{55}$$

Let us show that component-wise inequalities (48) hold for $\bar g^\varphi$.
Clearly they hold for unary components and for pairwise components
where $\Delta^+_{uv}(i,j) = 0$. Let $uv \in \mathcal{E}$ and $\Delta^+_{uv}(i,j) > 0$. Let $i' = q_u(i)$
and $j' = q_v(j)$. It must be that $i' \neq i$ and $j' \neq j$. Let us denote
$a = g^\varphi_{uv}(i',j')$, $b = g^\varphi_{uv}(i',j)$, $c = g^\varphi_{uv}(i,j')$ and $d = g^\varphi_{uv}(i,j)$.

Let $\bar d := \bar g^\varphi_{uv}(i,j) = g^\varphi_{uv}(i,j) - \Delta^+_{uv}(i,j) = d - (a + d - b - c) = b + c - a$.
From (48) we have that $a \le b, c, d$. It follows that $2a \le b + c$ or
$a \le b + c - a = \bar d$. We proved that $\bar g^\varphi_{uv}(q_u(i), q_v(j)) \le \bar g^\varphi_{uv}(i,j)$. In total,
$\bar g^\varphi$ satisfies all component-wise inequalities, same as $g^\varphi$ in (48). By
Theorem A.5, it follows that $\langle \bar g, (I - Q)\mu \rangle \ge 0$. The strict inequality
in case $Q\mu \neq \mu$ is considered similarly to Lemma A.6. □
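
The correction term (53) is straightforward to compute per edge. The sketch below is an illustration under the assumption of a dense cost matrix and substitutions given as index lists; the function name `reduction_2` is hypothetical. On the toy data it reproduces the value $\bar d = b + c - a$ derived in the proof.

```python
def reduction_2(g_uv, q_u, q_v):
    """Subtract the correction Delta^+ of (53) from one pairwise term.

    g_uv : matrix with g_uv[i][j]
    q_u, q_v : idempotent substitutions as lists, q_u[i] is the image of label i
    """
    n_u, n_v = len(g_uv), len(g_uv[0])
    reduced = [row[:] for row in g_uv]
    for i in range(n_u):
        for j in range(n_v):
            i2, j2 = q_u[i], q_v[j]
            delta = max(0, g_uv[i][j] + g_uv[i2][j2] - g_uv[i][j2] - g_uv[i2][j])
            reduced[i][j] = g_uv[i][j] - delta
    return reduced


# Two labels per node; q maps label 1 to label 0 at both nodes.
g = [[0, 1],
     [2, 9]]
print(reduction_2(g, q_u=[0, 0], q_v=[0, 0]))
```

Here $a = 0$, $b = 1$, $c = 2$, $d = 9$, so only the entry $(1,1)$ changes, from $9$ to $b + c - a = 3$.
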

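Theorem 6.2 (Reduction). Let $p \in \mathcal{P}^{2,y}$ and $\bar g$ be the corresponding
reduced cost vector constructed as in Def. 6.1. Let also $q \in \mathcal{P}^{2,y}$,
$q \le p$. Then $q \in \mathbb{S}_f$ iff $q \in \mathbb{S}_{\bar g}$.

Proof. Let $g = (I - P)^{\mathsf T} f$. By Lemma A.4, $q \in \mathbb{S}_f$ iff $q \in \mathbb{S}_g$. We
need to consider only pairwise terms. Let $uv \in \mathcal{E}$. Since $q \le p$, if
$p_u(i) = i$ then necessarily $q_u(i) = i$. Let $p$ be defined using sets $\mathcal{Y}_u$
as in (7). The reduction $\bar g$ in (27) will be composed of reductions by
Lemma A.6 and Lemma A.7.

From $g = (I - P)^{\mathsf T} f$ we have that for $i \in \mathcal{X}_u \setminus \mathcal{Y}_u$ and $j \in \mathcal{X}_v \setminus \mathcal{Y}_v$,
$g_{uv}(i,j) = 0$. Conditions of Lemma A.6 are satisfied with $\mathcal{U}_u = \mathcal{X}_u \setminus \mathcal{Y}_u$.
We obtain part of the reduction (27) for cases when $i \notin \mathcal{Y}_u$
or $j \notin \mathcal{Y}_v$. Let us denote the reduced vector $g'$. Applying Lemma A.7
to it, we obtain $\bar g$ as defined in (27). □

Lemma 8.1. Let $p \in \mathcal{P}^{2,y}$ and $\bar g$ be defined by (27) (depends on $p$).
Let $q \in \mathbb{S}_f \cap \mathcal{P}^{2,y}$, $q \le p$, $Q = [q]$. Let $\Lambda' \subseteq \Lambda$, $Q(\Lambda') \subseteq \Lambda'$ and
$\mathcal{O}^* = \operatorname{argmin}_{\mu \in \Lambda'} \langle \bar g, \mu \rangle$. Then $(\forall v \in \mathcal{V})\ q_v(\mathcal{O}^*_v) = \mathcal{O}^*_v$.

Proof. Let $\mu \in \mathcal{O}^*$. Assume for contradiction that $Q\mu \neq \mu$. In this
case, by Theorem 6.2, we have that $\langle \bar g, Q\mu \rangle < \langle \bar g, \mu \rangle$. Since $\mu \in \Lambda'$
and $Q(\Lambda') \subseteq \Lambda'$ there holds $Q\mu \in \Lambda'$. It follows that $Q\mu$ is a feasible
solution of a better cost than $\mu$, which contradicts optimality of $\mu$. It
must be therefore that $Q\mu = \mu$. The claim $q_v(\mathcal{O}^*_v) = \mathcal{O}^*_v$ follows. □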

Termination with arc consistent Solvers 

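Theorem A.8. Consider the verification LP defined by $g = (I - P^{\mathsf T}) f$.
Let $g^\varphi$ be an arc-consistent reparametrization and let
$\mathcal{Y}_u = \{ i \mid p_u(i) \neq i \}$. Then at least one of the two conditions is satisfied:

(a) $g^\varphi_0 = 0$ and $\varphi$ is dual optimal;

(b) $(\exists u \in \mathcal{V})\ \mathcal{O}_u(\varphi) \cap \mathcal{Y}_u \neq \emptyset$.

Proof. Assume (b) does not hold: $(\forall u \in \mathcal{V})\ \mathcal{O}_u(\varphi) \subseteq \mathcal{X}_u \setminus \mathcal{Y}_u$. For
each node $u$ let us choose a label $z_u \in \mathcal{O}_u(\varphi)$. By arc consistency,
for each edge $uv$ there is a label $j \in \mathcal{O}_v(\varphi) \subseteq \mathcal{X}_v \setminus \mathcal{Y}_v$ such that
$g^\varphi_{uv}(z_u, j)$ is active and similarly, there exists $i \in \mathcal{O}_u(\varphi) \subseteq \mathcal{X}_u \setminus \mathcal{Y}_u$
such that $g^\varphi_{uv}(i, z_v)$ is active.

By construction, $g_{uv}(i',j') = 0$ for all $i'j' \in \mathcal{X}_{uv} \setminus \mathcal{Y}_{uv}$ and
therefore the following modularity equality holds:

$$g^\varphi_{uv}(z_u, z_v) + g^\varphi_{uv}(i,j) = \big(0 - \varphi_{uv}(z_u) - \varphi_{vu}(z_v)\big) + \big(0 - \varphi_{uv}(i) - \varphi_{vu}(j)\big) = g^\varphi_{uv}(z_u, j) + g^\varphi_{uv}(i, z_v). \tag{56}$$

From $g^\varphi_{uv}(z_u, j)$ being active we have

$$g^\varphi_{uv}(z_u, j) \le g^\varphi_{uv}(i,j). \tag{57}$$

By adding (56) and (57) we obtain $g^\varphi_{uv}(z_u, z_v) \le g^\varphi_{uv}(i, z_v)$ and
hence $(z_u, z_v)$ is active. Therefore $\delta(z)$ and dual point $\varphi$ satisfy
complementary slackness and hence they are primal-dual optimal
and $g^\varphi_0 = E_g(z) = 0$. □

Lemma 5.4 (Correctness of Algorithm 2). If $(\forall v \in \mathcal{V})\ \mathcal{O}_v(\varphi) \cap \mathcal{Y}_v = \emptyset$
holds for an arc consistent dual vector $\varphi$, then $\varphi$ is optimal.

Proof. Corollary from Theorem A.8. □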

Fast Message Passing 

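Theorem 8.2 (Fast message passing). Message passing for an edge
term $\bar g_{uv}$ (27) can be reduced to that for $f_{uv}$ in time $O(|\mathcal{Y}_u| + |\mathcal{Y}_v|)$.

Proof. The components of the reduced problem $\bar g$ (27) can be
expressed directly in components of $f$ as in Table 6. Passing a message
on edge $uv$ amounts to calculating $\varphi_{vu}(j) := \min_{i \in \mathcal{X}_u} [a(i) + \bar g_{uv}(i,j)]$
for some vector $a \in \mathbb{R}^{\mathcal{X}_u}$. For $j \notin \mathcal{Y}_v$, substituting pairwise
terms of $\bar g$, it expands as

$$\varphi_{vu}(j) := \min_{i \in \mathcal{X}_u} [a(i) + \Delta_{uv}(i)]. \tag{58}$$

| Case | Components |
|---|---|
| $i \notin \mathcal{Y}_u$ | $\bar g_u(i) = 0$ |
| $i \in \mathcal{Y}_u$ | $\bar g_u(i) = f_u(i) - f_u(y_u)$ |
| $i \notin \mathcal{Y}_u,\ j \notin \mathcal{Y}_v$ | $\bar g_{uv}(i,j) = 0$, $\Delta_{uv}(i) := \Delta_{vu}(j) := 0$ |
| $i \notin \mathcal{Y}_u,\ j \in \mathcal{Y}_v$ | $\Delta_{vu}(j) := \min_{i' \notin \mathcal{Y}_u} [f_{uv}(i',j) - f_{uv}(i',y_v)]$, $\bar g_{uv}(i,j) = \Delta_{vu}(j)$ |
| $i \in \mathcal{Y}_u,\ j \notin \mathcal{Y}_v$ | $\Delta_{uv}(i) := \min_{j' \notin \mathcal{Y}_v} [f_{uv}(i,j') - f_{uv}(y_u,j')]$, $\bar g_{uv}(i,j) = \Delta_{uv}(i)$ |
| $i \in \mathcal{Y}_u,\ j \in \mathcal{Y}_v$ | $\bar g_{uv}(i,j) = \min\{ f_{uv}(i,j) - f_{uv}(y_u,y_v),\ \Delta_{vu}(j) + \Delta_{uv}(i) \}$ |

TABLE 6: Components of the Reduced Verification Problem

Since the message is equal for all $j \notin \mathcal{Y}_v$, it is sufficient to represent
it by $\varphi_{vu}(y_v)$ (recall that $y_v \notin \mathcal{Y}_v$). For $j \in \mathcal{Y}_v$, substituting pairwise
terms of $\bar g$ and denoting $c = f_{uv}(y_u, y_v)$,

$$\varphi_{vu}(j) := \min\Big\{ \min_{i \notin \mathcal{Y}_u} a(i) + \Delta_{vu}(j),\ \min_{i \in \mathcal{Y}_u} \big[ a(i) + \min\{ f_{uv}(i,j) - c,\ \Delta_{uv}(i) + \Delta_{vu}(j) \} \big] \Big\} \tag{59}$$

$$= \min\Big\{ \min_{i \notin \mathcal{Y}_u} a(i) + \Delta_{vu}(j), \tag{60a}$$

$$\min_{i \in \mathcal{Y}_u} [a(i) + f_{uv}(i,j)] - c, \tag{60b}$$

$$\min_{i \in \mathcal{Y}_u} [a(i) + \Delta_{uv}(i)] + \Delta_{vu}(j) \Big\}. \tag{60c}$$

Adding $\Delta_{uv}(i)$ inside (60a) (it is zero for $i \notin \mathcal{Y}_u$) and grouping (60a)
and (60c) together, we obtain for $j \in \mathcal{Y}_v$,

$$\varphi_{vu}(j) = \min\Big\{ \min_{i \in \mathcal{Y}_u} [a(i) + f_{uv}(i,j)] - c,\ \varphi_{vu}(y_v) + \Delta_{vu}(j) \Big\}. \tag{61}$$

Expression (60b) is a message passing for $f$, but the minimum is only
over $\mathcal{Y}_u$ and the result is needed only for $j \in \mathcal{Y}_v$. This message can be
computed in time $O(|\mathcal{Y}_u| + |\mathcal{Y}_v|)$ using the same algorithms [15, 28, 3]
(see also non-uniform min-convolution in [52]). Evaluating (61) takes
additional $O(|\mathcal{Y}_v|)$ time and the minimum in (58) takes $O(|\mathcal{Y}_u|)$ time
assuming that components of $a(i)$ are equal for $i \notin \mathcal{Y}_u$ (because it is
already true for $\bar g$ and $\varphi$). □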
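
The decomposition (58) and (61) can be verified numerically against the naive message on the reduced costs of Table 6. The sketch below is only a correctness check: it assumes a dense integer cost matrix, uses illustrative function names, and computes the inner minimum of (60b) by brute force, so it does not reproduce the $O(|\mathcal{Y}_u| + |\mathcal{Y}_v|)$ running time, which relies on the specialized algorithms cited in the proof.

```python
import random


def reduced_edge(f, y_u, y_v, Y_u, Y_v):
    """Reduced pairwise costs and the Delta terms, following Table 6."""
    n_u, n_v = len(f), len(f[0])
    keep_u = [i for i in range(n_u) if i not in Y_u]  # contains y_u
    keep_v = [j for j in range(n_v) if j not in Y_v]  # contains y_v
    d_uv = [min(f[i][j] - f[y_u][j] for j in keep_v) if i in Y_u else 0
            for i in range(n_u)]
    d_vu = [min(f[i][j] - f[i][y_v] for i in keep_u) if j in Y_v else 0
            for j in range(n_v)]
    g = [[0] * n_v for _ in range(n_u)]
    for i in range(n_u):
        for j in range(n_v):
            if i in Y_u and j in Y_v:
                g[i][j] = min(f[i][j] - f[y_u][y_v], d_vu[j] + d_uv[i])
            elif i in Y_u:
                g[i][j] = d_uv[i]
            elif j in Y_v:
                g[i][j] = d_vu[j]
    return g, d_uv, d_vu


def naive_message(a, g):
    """Direct evaluation of min_i [a(i) + g(i, j)] for every j."""
    return [min(a[i] + g[i][j] for i in range(len(a))) for j in range(len(g[0]))]


def decomposed_message(a, f, y_u, y_v, Y_u, Y_v, d_uv, d_vu):
    """Evaluation through (58) for j outside Y_v and (61) for j inside Y_v."""
    c = f[y_u][y_v]
    base = min(a[i] + d_uv[i] for i in range(len(a)))  # (58)
    msg = [base] * len(f[0])
    for j in Y_v:
        inner = min((a[i] + f[i][j] for i in Y_u), default=float("inf"))  # (60b)
        msg[j] = min(inner - c, base + d_vu[j])  # (61)
    return msg


rng = random.Random(0)
f = [[rng.randint(0, 20) for _ in range(6)] for _ in range(5)]
a = [rng.randint(0, 20) for _ in range(5)]
y_u, y_v, Y_u, Y_v = 0, 0, {1, 3}, {2, 4, 5}
g, d_uv, d_vu = reduced_edge(f, y_u, y_v, Y_u, Y_v)
print(naive_message(a, g) == decomposed_message(a, f, y_u, y_v, Y_u, Y_v, d_uv, d_vu))
```
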
Appendix B
Results Per Instance

| Instance | Algorithm | Time needed overall (s) | Time for initial solution (s) | Iterations Algorithm 1,2 | Iterations TRWS | Logarithmic percentage partial optimality | Percentage excluded labels |
|---|---|---|---|---|---|---|---|
| ProteinFolding | | | | | | | |
| 1CKK | Our-CPLEX | 2757.62 | 1177.62 | 5 | † | 14.24% | 27.04% |
| | Our-TRWS | 5.76 | 5.00 | 3 | 1000+15 | 13.83% | 26.53% |
| | MQPBO-10 | 5670.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | MQPBO | 825.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | [43]-CPLEX | 2502.69 | 2493.65 | 1 | † | 0.00% | 0.00% |
| | [43]-TRWS | 47.57 | 30.19 | 2 | 288+185 | 0.00% | 0.00% |
| 1CM1 | Our-CPLEX | 4070.00 | 992.15 | 7 | † | 8.38% | 34.28% |
| | Our-TRWS | 6.03 | 4.70 | 4 | 1000+65 | 8.07% | 29.98% |
| | MQPBO-10 | 5520.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | MQPBO | 723.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | [43]-CPLEX | 2388.46 | 2198.04 | 3 | † | 0.00% | 0.00% |
| | [43]-TRWS | 51.33 | 21.60 | 3 | 242+358 | 0.00% | 0.00% |
| 1SY9 | Our-CPLEX | 2628.72 | 416.74 | 5 | † | 25.34% | 51.30% |
| | Our-TRWS | 6.88 | 5.50 | 4 | 1000+15 | 28.06% | 57.98% |
| | MQPBO-10 | 7494.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | MQPBO | 2112.00 | 0.00 | 0 | † | 0.00% | 0.11% |
| | [43]-CPLEX | 1067.46 | 910.90 | 4 | † | 0.00% | 0.00% |
| | [43]-TRWS | 66.73 | 46.77 | 5 | 400+174 | 0.00% | 0.00% |
| 2BBN | Our-CPLEX | 9677.42 | 5476.81 | 5 | † | 2.12% | 8.58% |
| | Our-TRWS | 10.00 | 8.60 | 3 | 1000+10 | 2.64% | 14.17% |
| | MQPBO-10 | 1736.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | MQPBO | 2429.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | [43]-CPLEX | 9776.60 | 9771.18 | 1 | † | 0.00% | 0.00% |
| | [43]-TRWS | 54.21 | 42.90 | 2 | 242+146 | 0.00% | 0.00% |
| 2BCX | Our-CPLEX | 36222.90 | 6998.66 | 5 | † | 4.81% | 15.66% |
| | Our-TRWS | 9.14 | 7.90 | 3 | 1000+55 | 4.39% | 14.21% |
| | MQPBO-10 | 1008.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | MQPBO | 1288.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | [43]-CPLEX | 11419.60 | 11409.90 | 2 | † | 0.00% | 0.00% |
| | [43]-TRWS | 55.26 | 39.60 | 2 | 252+194 | 0.00% | 0.00% |
| 2BE6 | Our-CPLEX | 1381.60 | 765.84 | 4 | † | 9.14% | 17.68% |
| | Our-TRWS | 4.67 | 3.91 | 4 | 1000+60 | 8.96% | 15.12% |
| | MQPBO-10 | 3728.00 | 0.00 | 0 | † | 0.00% | 0.05% |
| | MQPBO | 540.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | [43]-CPLEX | 1552.95 | 1552.88 | 1 | † | 0.00% | 0.00% |
| | [43]-TRWS | 40.12 | 28.12 | 2 | 363+230 | 0.00% | 0.00% |
| 2F3Y | Our-CPLEX | 3628.90 | 2546.68 | 5 | † | 6.22% | 10.66% |
| | Our-TRWS | 5.83 | 5.20 | 3 | 1000+10 | 8.39% | 13.74% |
| | MQPBO-10 | 5138.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | MQPBO | 928.00 | 0.00 | 0 | † | 0.00% | 0.05% |
| | [43]-CPLEX | 4618.78 | 4618.76 | 1 | † | 0.00% | 0.00% |
| | [43]-TRWS | 41.87 | 33.03 | 3 | 321+164 | 0.00% | 0.00% |
| 2FOT | Our-CPLEX | 7458.75 | 1996.55 | 5 | † | 4.10% | 11.64% |
| | Our-TRWS | 6.25 | 5.30 | 4 | 1000+25 | 4.01% | 11.01% |
| | MQPBO-10 | 4961.00 | 0.00 | 0 | † | 0.00% | 0.09% |
| | MQPBO | 1054.00 | 0.00 | 0 | † | 0.00% | 0.07% |
| | [43]-CPLEX | 4473.58 | 4440.51 | 1 | † | 0.00% | 0.00% |
| | [43]-TRWS | 61.92 | 44.42 | 2 | 398+222 | 0.00% | 0.00% |
| 2HQW | Our-CPLEX | 5721.95 | 1946.20 | 6 | † | 10.30% | 17.30% |
| | Our-TRWS | 6.49 | 4.80 | 6 | 1000+160 | 8.33% | 18.08% |
| | MQPBO-10 | 7228.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | MQPBO | 1193.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | [43]-CPLEX | 2163.98 | 2161.07 | 1 | † | 0.00% | 0.00% |
| | [43]-TRWS | 44.07 | 35.46 | 2 | 382+121 | 0.00% | 0.00% |
| 2O60 | Our-CPLEX | 12085.40 | 3007.95 | 6 | † | 4.22% | 12.81% |
| | Our-TRWS | 7.74 | 6.50 | 3 | 1000+55 | 4.94% | 15.55% |
| | MQPBO-10 | 7516.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | MQPBO | 1997.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | [43]-CPLEX | 6137.07 | 6128.14 | 1 | † | 0.00% | 0.00% |
| | [43]-TRWS | 93.87 | 46.42 | 2 | 352+369 | 0.00% | 0.00% |
| 3BXL | Our-CPLEX | 3247.11 | 915.86 | 7 | † | 4.97% | 17.18% |
| | Our-TRWS | 7.44 | 5.90 | 4 | 1000+60 | 4.66% | 12.35% |
| | MQPBO-10 | 6709.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | MQPBO | 1291.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | [43]-CPLEX | 1776.23 | 1598.07 | 2 | † | 0.00% | 0.00% |
| | [43]-TRWS | 44.71 | 25.52 | 2 | 227+216 | 0.00% | 0.00% |
| pdb1b25 | Our-CPLEX | 1599.67 | 55.01 | 28 | † | 76.76% | 84.05% |
| | Our-TRWS | 5.18 | 2.92 | 18 | 530+150 | 83.00% | 87.84% |
| Instance | Algorithm | Time needed overall (s) | Time for initial solution (s) | Iterations Algorithm 1,2 | Iterations TRWS | Logarithmic percentage partial optimality | Percentage excluded labels |
|---|---|---|---|---|---|---|---|
| pdb1b25 | MQPBO-10 | 27.00 | 0.00 | 0 | † | 0.00% | 2.53% |
| | MQPBO | 2.00 | 0.00 | 0 | † | 0.00% | 1.99% |
| | [43]-CPLEX | 324.64 | 72.11 | 14 | † | 18.84% | 22.32% |
| | [43]-TRWS | 119.71 | 27.62 | 14 | 443+1238 | 18.92% | 22.34% |
| pdb1d2e | Our-CPLEX | 154.76 | 25.44 | 5 | † | 97.30% | 97.98% |
| | Our-TRWS | 1.67 | 1.13 | 7 | 420+75 | 96.97% | 98.25% |
| | MQPBO-10 | 12.00 | 0.00 | 0 | † | 0.00% | 4.61% |
| | MQPBO | 0.00 | 0.00 | 0 | † | 0.00% | 2.74% |
| | [43]-CPLEX | 483.55 | 34.59 | 25 | † | 55.53% | 58.94% |
| | [43]-TRWS | 83.82 | 6.16 | 47 | 190+2775 | 55.69% | 58.98% |
| pdb1fmj | Our-CPLEX | 99.33 | 12.35 | 7 | † | 92.58% | 94.90% |
| | Our-TRWS | 1.05 | 0.60 | 14 | 540+135 | 83.18% | 87.09% |
| | MQPBO-10 | 6.00 | 0.00 | 0 | † | 0.00% | 2.92% |
| | MQPBO | 0.00 | 0.00 | 0 | † | 0.00% | 2.04% |
| | [43]-CPLEX | 77.30 | 16.97 | 11 | † | 15.94% | 18.83% |
| | [43]-TRWS | 16.67 | 3.10 | 11 | 186+677 | 16.18% | 18.91% |
| pdb1i24 | Our-TRWS | 0.06 | 0.02 | 2 | 60+5 | 99.73% | 99.94% |
| | MQPBO-10 | 3.00 | 0.00 | 0 | † | 0.00% | 2.85% |
| | MQPBO | 0.00 | 0.00 | 0 | † | 0.00% | 3.43% |
| | [43]-CPLEX | 5.66 | 5.66 | 0 | † | 100.00% | 100.00% |
| | [43]-TRWS | 0.82 | 0.82 | 0 | 115+0 | 100.00% | 100.00% |
| pdb1iqc | Our-CPLEX | 111.58 | 18.51 | 5 | † | 99.10% | 99.63% |
| | Our-TRWS | 0.74 | 0.40 | 8 | 200+35 | 96.39% | 97.10% |
| | MQPBO-10 | 8.00 | 0.00 | 0 | † | 0.00% | 6.06% |
| | MQPBO | 0.00 | 0.00 | 0 | † | 0.00% | 4.90% |
| | [43]-CPLEX | 229.09 | 24.36 | 20 | † | 35.32% | 41.15% |
| | [43]-TRWS | 36.03 | 4.06 | 28 | 169+2058 | 40.50% | 45.56% |
| pdb1jmx | Our-CPLEX | 142.20 | 15.52 | 9 | † | 97.24% | 98.69% |
| | Our-TRWS | 0.67 | 0.29 | 10 | 200+75 | 93.46% | 95.83% |
| | MQPBO-10 | 8.00 | 0.00 | 0 | † | 0.00% | 3.76% |
| | MQPBO | 0.00 | 0.00 | 0 | † | 0.00% | 3.73% |
| | [43]-CPLEX | 121.71 | 16.21 | 19 | † | 35.86% | 39.98% |
| | [43]-TRWS | 20.02 | 3.59 | 24 | 188+1098 | 35.26% | 39.12% |
| pdb1kgn | Our-CPLEX | 196.03 | 17.97 | 10 | † | 89.22% | 93.23% |
| | Our-TRWS | 1.37 | 0.76 | 9 | 400+170 | 88.92% | 93.16% |
| | MQPBO-10 | 9.00 | 0.00 | 0 | † | 0.00% | 3.24% |
| | MQPBO | 0.00 | 0.00 | 0 | † | 0.00% | 2.27% |
| | [43]-CPLEX | 161.57 | 24.37 | 12 | † | 39.42% | 39.67% |
| | [43]-TRWS | 53.49 | 6.45 | 17 | 268+1824 | 13.20% | 13.36% |
| pdb1kwh | Our-CPLEX | 105.77 | 9.63 | 10 | † | 79.09% | 85.64% |
| | Our-TRWS | 0.46 | 0.27 | 8 | 440+50 | 76.53% | 83.26% |
| | MQPBO-10 | 5.00 | 0.00 | 0 | † | 0.00% | 2.99% |
| | MQPBO | 0.00 | 0.00 | 0 | † | 0.00% | 3.43% |
| | [43]-CPLEX | 51.33 | 12.89 | 9 | † | 25.54% | 31.15% |
| | [43]-TRWS | 9.15 | 2.43 | 8 | 208+401 | 25.43% | 31.13% |
| pdb1m3y | Our-CPLEX | 73.60 | 18.58 | 3 | † | 98.54% | 99.47% |
| | Our-TRWS | 0.79 | 0.65 | 3 | 340+10 | 97.53% | 99.08% |
| | MQPBO-10 | 8.00 | 0.00 | 0 | † | 0.00% | 6.38% |
| | MQPBO | 0.00 | 0.00 | 0 | † | 0.00% | 5.72% |
| | [43]-CPLEX | 120.60 | 25.18 | 14 | † | 31.05% | 27.97% |
| | [43]-TRWS | 28.19 | 4.82 | 12 | 200+1135 | 31.08% | 27.98% |
| pdb1qks | Our-CPLEX | 138.12 | 15.19 | 8 | † | 98.30% | 98.93% |
| | Our-TRWS | 0.30 | 0.12 | 4 | 80+20 | 98.57% | 99.38% |
| | MQPBO-10 | 9.00 | 0.00 | 0 | † | 0.00% | 5.09% |
| | MQPBO | 0.00 | 0.00 | 0 | † | 0.00% | 3.68% |
| | [43]-CPLEX | 96.77 | 15.82 | 12 | † | 28.18% | 26.37% |
| | [43]-TRWS | 27.99 | 3.24 | 15 | 161+1154 | 30.63% | 28.46% |
| color-seg | | | | | | | |
| colseg-cow3 | Our-TRWS | | 48.10 | 6 | 1000+140 | 99.96% | 99.97% |
| | Kovtun[24] | 1.00 | 0.00 | 0 | † | 0.00% | 99.89% |
| | MQPBO-10 | 206.00 | 0.00 | 0 | † | 0.00% | 43.55% |
| | MQPBO | 24.00 | 0.00 | 0 | † | 0.00% | 32.06% |
| | [43]-TRWS | 7530.72 | 690.55 | 14 | 826+6056 | 99.95% | 99.95% |
| colseg-cow4 | Our-TRWS | 91.26 | 48.31 | 13 | 1000+310 | 99.92% | 99.93% |
| | Kovtun[24] | 2.00 | 0.00 | 0 | † | 0.00% | 99.90% |
| | MQPBO-10 | 46.00 | 0.00 | 0 | † | 0.00% | 0.56% |
| | MQPBO | 40.00 | 0.00 | 0 | † | 0.00% | 0.37% |
| | [43]-TRWS | 7395.03 | 742.58 | 10 | 848+6349 | 99.80% | 99.80% |
| colseg-garden4 | Our-TRWS | 0.49 | 0.15 | 5 | 70+20 | 99.91% | 99.94% |
| | Kovtun[24] | 0.00 | 0.00 | 0 | † | 0.00% | 94.96% |
| | MQPBO-10 | 14.00 | 0.00 | 0 | † | 0.00% | 4.27% |
| | MQPBO | 1.00 | 0.00 | 0 | † | 0.00% | 0.21% |
| | [43]-TRWS | 33.68 | 6.75 | 5 | 167+488 | 99.89% | 99.89% |
| Instance | Algorithm | Time needed overall (s) | Time for initial solution (s) | Iterations Algorithm 1,2 | Iterations TRWS | Logarithmic percentage partial optimality | Percentage excluded labels |
|---|---|---|---|---|---|---|---|
| color-seg-n4 | | | | | | | |
| clownfish-small | Our-TRWS | 1.72 | 0.68 | 3 | 80+10 | >99.99% | >99.99% |
| | Kovtun[24] | 1.00 | 0.00 | 0 | † | 0.00% | 74.11% |
| | MQPBO-10 | 536.00 | 0.00 | 0 | † | 0.00% | 15.83% |
| | MQPBO | 41.00 | 0.00 | 0 | † | 0.00% | 4.67% |
| | [43]-TRWS | 151.98 | 30.01 | 6 | 223+610 | 99.97% | 99.97% |
| crops-small | Our-TRWS | 1.87 | 1.02 | 2 | 120+5 | 100.00% | 100.00% |
| | Kovtun[24] | 1.00 | 0.00 | 0 | † | 0.00% | 64.70% |
| | MQPBO-10 | 577.00 | 0.00 | 0 | † | 0.00% | 14.32% |
| | MQPBO | 33.00 | 0.00 | 0 | † | 0.00% | 0.71% |
| | [43]-TRWS | 677.08 | 34.88 | 40 | 260+3578 | 99.00% | 99.00% |
| fourcolors | Our-TRWS | 0.57 | 0.08 | 2 | 20+5 | 99.96% | 99.97% |
| | Kovtun[24] | 0.00 | 0.00 | 0 | † | 0.00% | 69.52% |
| | MQPBO-10 | 37.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | MQPBO | 3.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | [43]-TRWS | 31.28 | 2.60 | 8 | 34+238 | 99.92% | 99.92% |
| lake-small | Our-TRWS | 1.28 | 0.43 | 2 | 50+5 | 100.00% | 100.00% |
| | Kovtun[24] | 1.00 | 0.00 | 0 | † | 0.00% | 74.87% |
| | MQPBO-10 | 607.00 | 0.00 | 0 | † | 0.00% | 15.31% |
| | MQPBO | 31.00 | 0.00 | 0 | † | 0.00% | 6.65% |
| | [43]-TRWS | 13.75 | 13.75 | 0 | 95+-95 | 100.00% | 100.00% |
| palm-small | Our-TRWS | 2.48 | 1.37 | 3 | 160+10 | >99.99% | >99.99% |
| | Kovtun[24] | 1.00 | 0.00 | 0 | † | 0.00% | 68.65% |
| | MQPBO-10 | 510.00 | 0.00 | 0 | † | 0.00% | 0.48% |
| | MQPBO | 19.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | [43]-TRWS | 846.27 | 39.97 | 19 | 291+4582 | 98.20% | 98.20% |
| penguin-small | Our-TRWS | 1.21 | 0.54 | 2 | 90+5 | 100.00% | 100.00% |
| | Kovtun[24] | 0.00 | 0.00 | 0 | † | 0.00% | 91.99% |
| | MQPBO-10 | 193.00 | 0.00 | 0 | † | 0.00% | 1.42% |
| | MQPBO | 13.00 | 0.00 | 0 | † | 0.00% | 1.03% |
| | [43]-TRWS | 15.67 | 15.67 | 0 | 152+-152 | 100.00% | 100.00% |
| pfau-small | Our-TRWS | 18.77 | 7.22 | 48 | 950+470 | 89.43% | 93.41% |
| | Kovtun[24] | 1.00 | 0.00 | 0 | † | 0.00% | 5.59% |
| | MQPBO-10 | 591.00 | 0.00 | 0 | † | 0.00% | 0.70% |
| | MQPBO | 16.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | [43]-TRWS | 799.08 | 79.34 | 44 | 654+10857 | 10.43% | 10.43% |
| snail | Our-TRWS | 0.79 | 0.23 | 2 | 50+5 | 99.99% | 99.99% |
| | Kovtun[24] | 0.00 | 0.00 | 0 | † | 0.00% | 97.77% |
| | MQPBO-10 | 7.00 | 0.00 | 0 | † | 0.00% | 77.91% |
| | MQPBO | 1.00 | 0.00 | 0 | † | 0.00% | 58.35% |
| | [43]-TRWS | 46.20 | 6.47 | 5 | 83+332 | 99.98% | 99.98% |
| strawberry-glass-2-small | Our-TRWS | 1.35 | 0.60 | 2 | 80+5 | 100.00% | 100.00% |
| | Kovtun[24] | 1.00 | 0.00 | 0 | † | 0.00% | 54.99% |
| | MQPBO-10 | 528.00 | 0.00 | 0 | † | 0.00% | 2.78% |
| | MQPBO | 39.00 | 0.00 | 0 | † | 0.00% | 0.00% |
| | [43]-TRWS | 311.54 | 31.00 | 11 | 259+1721 | 99.31% | 99.31% |
| mrf-photomontage | | | | | | | |
| family-gm | Our-TRWS | 286.40 | 93.08 | 77 | 1000+1265 | 4.75% | 4.80% |
| | MQPBO-10 | 1087.00 | 0.00 | 0 | † | 0.00% | 4.41% |
| | MQPBO | 90.00 | 0.00 | 0 | † | 0.00% | 4.34% |
| | [43]-TRWS | 12726.45 | 1291.11 | 50 | 1015+22483 | 4.41% | 4.41% |
| pano-gm | Our-TRWS | 320.00 | 112.17 | 59 | 1000+1105 | 67.73% | 79.17% |
| | MQPBO-10 | 646.00 | 0.00 | 0 | † | 0.00% | 28.06% |
| | MQPBO | 97.00 | 0.00 | 0 | † | 0.00% | 40.37% |
| | [43]-TRWS | 14360.45 | 1871.14 | 33 | 911+11193 | 27.55% | 27.55% |
| mrf-stereo | | | | | | | |
| ted-gm | Our-TRWS | 231.97 | 72.67 | 119 | 1000+715 | 67.27% | 72.05% |
| | [43]-TRWS | 3837.51 | 436.30 | 28 | 689+10383 | 38.13% | 38.13% |
| tsu-gm | Our-TRWS | 19.75 | 14.67 | 10 | 670+75 | 99.91% | 99.94% |
| | [43]-TRWS | 9277.99 | 267.55 | 54 | 377+17421 | 0.39% | 0.39% |
| ven-gm | Our-TRWS | 108.73 | 94.44 | 9 | 1000+40 | 0.01% | 0.02% |
| | [43]-TRWS | 14737.47 | 1451.83 | 55 | 993+16592 | 0.00% | 0.00% |


TABLE 7: Detailed experimental evaluation for Algorithm 1 utilising CPLEX [17] as a subsolver, denoted as Our-CPLEX, Algorithm 2
utilising TRW-S [20] as a subsolver, denoted as Our-TRWS, their counterparts from [43] denoted by [43]-CPLEX and [43]-TRWS and
MQPBO [19] run for one iteration with predefined label order, denoted by MQPBO, and run 10 iterations in 10 random label orders, denoted
by MQPBO-10. † - result is not available.



